Vectorized code slower than for loop in Matlab

Question

I have an 8x8 matrix called gimg. I ran the two versions of the same computation below, one in a for loop and the other vectorized, on 5 different gimg matrices.

tic
dm = zeros(size(gimg));

for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
    end
end
toc

tic
[x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));  

dm = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
toc

Here are the results. In each pair, the first time is the for loop and the second is the vectorized version:

Elapsed time is 0.000057 seconds.
Elapsed time is 0.000247 seconds.

Elapsed time is 0.000062 seconds.
Elapsed time is 0.000199 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000195 seconds.

Elapsed time is 0.000055 seconds.
Elapsed time is 0.000192 seconds.

Elapsed time is 0.000056 seconds.
Elapsed time is 0.000187 seconds.

Is it because of the ones matrix?

I find that the JIT acceleration feature in Matlab changes the times dramatically for for loops. So my question is: is it still worth vectorizing code, given these features of the JIT compiler?

UPDATE: this is one example of my gimg matrix

gimg =

         259          42           0           0           0           0           0           0
          42        1064          41           0           0           0           0           0
           0          55        3444         196           0           0           0           0
           0           0         215        3581          47           0           0           0
           0           0           0         100         806           3           0           0
           0           0           0           0           3           2           0           0
           0           0           0           0           0           0           0           0
           0           0           0           0           0           0           0           0

UPDATE 2: results from @Divakar's code

>> test_vct
------------------------ With Original Loopy Approach
Elapsed time is 5.269883 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 6.314792 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 3.146764 seconds.
>>

So, on my computer, the original vectorized approach is still slower than the loop.

My computer specs and Matlab version

Matlab 2015a
Windows 8.1 x64
Intel i7 860 2.80 GHz
16 GB RAM
Nvidia Geforce GTS250

Use timeit for reliable timing results or use a number of trials on top of the existing codes. — Divakar
when comparing performance like that, you should detail your Matlab version and computer type (32/64 bits, which processor etc...) — Hoki
Also, it seems you don't really need that ones, replacing it with 1 would work. — Divakar
@SamuelNLP this will probably also depend on the size of your gimg matrices. If you plot these times against matrix size, you may find that they cross over. Regarding whether it is worth vectorizing: with your code taking fractions of a second, I would say that unless you are repeating this millions of times, just go with whichever is quicker for you to implement and debug. You should only do complicated optimization if you are actually facing speed problems. Otherwise, optimize for code maintainability and development time. — Dan
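
The two suggestions from the comments (timing with timeit, and replacing ones(...) with a scalar 1) could be tried together with something like the sketch below. It is only an illustration: gimg is random stand-in data, the handle names f_ones and f_scalar are made up for this example, and the ndgrid call is deliberately kept outside the timed handles, so only the arithmetic is measured. For inputs this small, timeit may warn that the function runs too fast to be measured accurately.

```matlab
% Stand-in input (assumption: any 8x8 numeric matrix behaves similarly).
gimg = rand(8, 8);

% Index grids, built once and captured by the function handles below.
[x, y] = ndgrid(1:size(gimg, 1), 1:size(gimg, 2));

% Variant from the question: divides a full matrix of ones.
f_ones = @() (ones(size(gimg)) ./ (1 + (x - y).^2)) .* gimg;

% Variant from the comments: scalar 1 is expanded automatically by ./
f_scalar = @() (1 ./ (1 + (x - y).^2)) .* gimg;

% timeit calls each handle many times and returns a typical run time in seconds.
t_ones   = timeit(f_ones);
t_scalar = timeit(f_scalar);
fprintf('ones(...): %.3g s, scalar 1: %.3g s\n', t_ones, t_scalar);

% Check that both variants produce the same matrix.
same = isequal(f_ones(), f_scalar());
```

Repeating this for several matrix sizes would show whether the gap changes with size, as suggested above.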

Divakar · Accepted Answer · 2015-04-23T14:34:39

This seems faster than both of those -

dm = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
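
Before comparing speed, it can be worth confirming that the bsxfun expression gives the same result as the nested loop. A minimal check, assuming random stand-in data for gimg (the names dm_loop, dm_bsx, d and max_err are only for this illustration):

```matlab
% Stand-in input; any numeric matrix should do.
gimg = rand(8, 8);

% Reference result from the nested loop in the question.
dm_loop = zeros(size(gimg));
for x = 1:size(gimg, 1)
    for y = 1:size(gimg, 2)
        dm_loop(x, y) = (1/(1 + (x - y)^2)) * gimg(x, y);
    end
end

% bsxfun expands the column vector of row indices against the row vector
% of column indices, giving the full matrix of (x - y) differences
% without building the ndgrid matrices.
d = bsxfun(@minus, (1:size(gimg, 1))', 1:size(gimg, 2));
dm_bsx = (1 ./ (1 + d.^2)) .* gimg;

% Largest absolute difference between the two results.
max_err = max(abs(dm_loop(:) - dm_bsx(:)));
fprintf('max abs difference: %g\n', max_err);
```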

Benchmarking code -

%// Random input
gimg = rand(8,8);

%// Number of trials (keep this a big number, so as to get runtimes of 1 sec+)
num_iter = 100000;

disp('------------------------ With Original Loopy Approach')
tic
for iter = 1:num_iter
    dm = zeros(size(gimg));
    for x = 1:size(gimg, 1)
        for y = 1:size(gimg, 2)
            dm(x, y) = (1/(1 + (x - y)^2))*gimg(x,y);
        end
    end
end
toc

disp('------------------------ With Original Vectorized Approach')
tic
for iter = 1:num_iter
    [x,y] = ndgrid(1:size(gimg, 1),1:size(gimg, 2));
    dm2 = (ones(size(gimg))./(1 + (x - y).^2)).*gimg;
end
toc

disp('------------------------ With Proposed Vectorized Approach')
tic
for iter = 1:num_iter
    dm3 = (1./(1+bsxfun(@minus,[1:size(gimg, 1)]',1:size(gimg, 2)).^2).*gimg);
end
toc

Results -

------------------------ With Original Loopy Approach
Elapsed time is 4.996531 seconds.
------------------------ With Original Vectorized Approach
Elapsed time is 2.684011 seconds.
------------------------ With Proposed Vectorized Approach
Elapsed time is 1.338118 seconds.
