Transporting vectorized Matlab code to python, numpy

Question

I am transporting my matlab code to python. There are alot of things that I am trying to find replacements for in python and numpy

Matlab Code:

    [m,n]=size(Image);

    canvas=zeros(m,n);

    U_res_sel=squeeze(loading);
    [s1,s2]=size(U_res_sel);

    if mod(s1,2)==0
        dy=s1/2-1;
    else
        dy=(s1-1)/2;
    end
    if mod(s2,2)==0
        dx=s2/2-1;
    else
        dx=(s2-1)/2;
    end

    xmin=dx+1;
    ymin=dy+1;
    xmax=n-(s2-dx-1);
    ymax=m-(s1-dy-1);

    [x,y]=meshgrid(xmin:xmax,ymin:ymax);

    ind=sub2ind([m,n],y(:),x(:));

    nps=repmat(((-dx+(0:s2-1))*m-dy),s1,1)+repmat((0:(s1-1)).',1,s2);

    ind=repmat(ind,1,numel(nps))+repmat(nps(:).',numel(ind),1);

    A=(Image(ind)-repmat(mean(Image(ind),2),1,numel(nps)));

    B=repmat((U_res_sel(:)-mean(U_res_sel(:))).',size(ind,1),1);

    canvas(ymin:ymax,xmin:xmax)=reshape(sum(A.*B,2)./sqrt(sum((A.^2),2).*sum((B.^2),2)),ymax-ymin+1,[]);

    canvas = canvas(y1+1:z1+y1+1,y2+1:z2+y2+1);

I guess i will explain line by line where problems are occurring.

i am using

import numpy as np

for the numpy functions

1.

The variables work fine until i reach the meshgrid line in python.

    [x,y] = np.mgrid[xmin:xmax,ymin:ymax]

x in matlab has a size 517,517 using test data x in python has a size has 516 by 516 so i changed the python section of the code with

    xmax=n-(s2-dx-1) + 1
    ymax=m-(s1-dy-1) + 1

    [y,x] = np.mgrid[xmin:xmax,ymin:ymax]

to make it the same data sets as the matlab code. But I am not sure if the indexing is sound. If i have the exact same matrix in matlab to numpy array are they equivalent?

2.

The next matlab line is a mess for me to figure out.

    ind=sub2ind([m,n],y(:),x(:));

for the x(:) and y(:) i am using

    x = np.reshape(x,(x.size,1))
    y = np.reshape(y,(y.size,1))
    x = np.int64(x)
    y = np.int64(y)

for the sub2ind function i am using in python

    ind = np.ravel_multi_index((y,x), dims=(m,n) )

but here is where the numbers mess up.

in matlab i get a column vector with a range 3723 to 278760 for a certain data set and for the same data set in python i get a column vector of 4264 to 279292 with a different stepping in between.

they both are the same size of (267289,1) though.

3.

This line i got working fine in matlab and python I am just putting it here so i can be concise to myself.

matlab:

    nps=repmat(((-dx+(0:s2-1))*m-dy),s1,1)+repmat((0:(s1-1)).',1,s2);

python:

    dtx = (-dx + np.arange(0,s2,1))
    dtx_2 = np.arange(0,s1,1)
    dtx_2 = np.reshape(dtx_2,(dtx_2.size,1))

    nps = np.tile(   dtx*m-dy,(s1,1)  ) + np.tile(   dtx_2  ,(1,s2)  )

(4).

for line in matlab

    ind=repmat(ind,1,numel(nps))+repmat(nps(:).',numel(ind),1);

in python i am trying

    a = np.tile(ind,(1,nps.size))
    b = np.tile( np.transpose(dtind) , (ind.size,1) )
    ind = a + b

I split it up into a and b to make it more readable.

But nothing in this is really working as intended.

(5).

I am unsure how to access a variable by index in python.

In matlab i could just do Image(ind), but is my code useless now in python because i can't find an alternative to this?

A bit of a note if you try running the matlab code, is that it will cause your computer and matlab to freeze and crash without warning if you run it on big data sets. I remedied this situation by by putting the code in a wrapper that indexes smaller portions of the data to get the final large image which prevents memory overflows.

I hope i made my mess of code clear enough. This code works fully well in matlab, but matlab is very terrible to be basing my projects in mainly because I can't give my code to other people.

EDIT:

This function is a vectorized program that basically does: (this is pseudocode so the indexing might not be right off the top of my head)

It also does not have padding in this segment. The loading variable i useis a gaussian matrix or a lapacian that can range from 6x6 to 64x64. It can be any size really, as long as its smaller than the image.

    correlation_coeficcient_surface = function(Image,loading)
    [m,n] = size(image)
    [u1,u2] = size(loading)
    canvas = zeros(size(image))
    for yii = 1:n
         for xii = 1:m
              image_segment = Image(yii-floor(u1/2):yii+ceil(u1/2),xii-floor(u2/2):xii+ceil(u2/2));
              if(size(image_segment) == size(loading)
                  canvas(yii-floor(u1/2):yii+ceil(u1/2),xii-floor(u2/2):xii+ceil(u2/2)) = corr2(Image_segment,loading);
              end
          end
     end


    end

It got to be vectorized because iterating over every element made it take really long amounts of time on large images.

Edit Edit:

Here is what my filter actually does.

This is a sample Image

http://i.imgur.com/o9kV3nK.png

This is a sample loading shape that i use to correlate with

http://i.imgur.com/oYW3k2K.png

this is after my matlab filter is finished the images are not aligned, i am just cropping examples to show you what it does shapewise.

http://i.imgur.com/aa4ljue.png

this is scipy.signal.convolve2d which does something that i am not intending.

http://i.imgur.com/vJhXMam.png

If you describe in code what your code is trying to do, it makes it a lot easier for others to read it and help you. — Bi Rico
This looks like a 'sliding window' filter. docs.scipy.org/doc/scipy/reference/ndimage.html The scipy ndimage package might apply. numpy striding may also help: stackoverflow.com/a/4947453/901925 — hpaulj
I posted pictures of what my code actually does. If there is an existing filter that does it more efficiently than that would be more preferable. I could not find anything that did it in matlab though which made me to write my own. — Thoroniul

nbren12 nbren12 · Accepted Answer · 2014-06-26T21:26:03

I think you should slow down a little, and read some materials about the basics of python arrays like indexing and broadcasting. Firstly, I woud read the basic tutorial at http://www.sam.math.ethz.ch/~raoulb/teaching/PythonTutorial/intro_numpy.html. Also http://mathesaurus.sourceforge.net/matlab-numpy.html contains a table with correspondences between certain numpy and matlab operations. However, in general, I would keep an open mind and realize that the matlab-way is often not the best way in numpy.

I won't respond to all your questions directly, but here are the following thoughts.

Python indexing is zero indexed, . Therefore matlab arr(1:100) is the same as numpy arr[0:100].
You can index numpy arrays with either logical arrays or arrays of indices like just like in matlab
In general, the functionality of repmat is handled in numpy by intelligent broadcasting, that doesn't require calling tile manually. For instance, the following code makes a random 100x100 array, calculates the row mean, and subtracts the row mean from the row (e.g. centers the data):
```
arr = np.random.rand(100,100)
mu  = arr.mean(axis=1)
centered = arr - mu[:,None]
```
The mu[:,None] array has size (100,1), and numpy is smart enough to "broadcast" it to size (100,100) to calculate centered. Continuing the example, mu[:,None,None] would have size (100,1,1).
Matlab's size(arr) is the same as numpy's arr.shape.

Good luck!

Edit: For instance you can do your #3 more concisely:

nps = (-dx+np.arange(s2)*m -du)[None,:] + np.arange(s1)[:,None]

And #4:

ind=  ind[:, None] + nps.ravel()[None, :]

Transporting vectorized Matlab code to python, numpy

3 Answers

"""