I'm implementing separable convolution to speed up 2D Gaussian convolution.
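clear all;
close all;
im = randi([0,255],1024,1024);             % random 1024x1024 test image
win = 7;
window = fspecial('gaussian',win,win/6);   % 7x7 Gaussian kernel, sigma = win/6
% Factor the rank-1 kernel into a column and a row vector via SVD
[U, S, V] = svd(window);
v = U(:,1) * sqrt(S(1,1));                 % vertical (column) filter
h = V(:,1)' * sqrt(S(1,1));                % horizontal (row) filter
out1 = filter2(window, im);                % full 2D filtering
out2 = filter2(h, filter2(v, im));         % separable: columns first, then rows
norm(out1 - out2)                          % check that both results agree
% Time the full 2D version
tic
for i = 1:1000
    out1 = filter2(window, im);
end
toc
% Time the separable version
tic
for i = 1:1000
    out2 = filter2(h, filter2(v, im));
end
toc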
The separable version is supposed to be faster by a factor of win*win/(win + win) = 3.5, but it is actually slower:
ans =
2.6250e-12
Elapsed time is 5.486270 seconds.
Elapsed time is 8.769868 seconds.
Is there some hidden implementation detail inside filter2 that explains this?
filter2 in the second loop... it seems like this is the expected result. - gariepy
conv2 has a three-input version for separable kernels. Maybe that will be faster. You may need to flip some of the inputs to match the result of filter2 (filter2 computes correlation, not convolution). - Luis Mendo
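
To make the conv2 suggestion from the comment above concrete, here is a minimal sketch of how the three-input form might be used with the same kernel. It assumes the same setup as the question (fspecial from the Image Processing Toolbox, a 7x7 Gaussian); the names out3, relErr, t2d and tsep are mine and only for illustration. Both factors are flipped because conv2 computes convolution while filter2 computes correlation. For a symmetric Gaussian the flips change nothing, but they keep the sketch valid for asymmetric separable kernels.

```matlab
% Sketch: separable filtering with the three-input form conv2(u, v, A, shape)
im = randi([0,255],1024,1024);
win = 7;
window = fspecial('gaussian',win,win/6);   % needs the Image Processing Toolbox

% Factor the rank-1 kernel into a column vector v and a row vector h
[U, S, V] = svd(window);
v = U(:,1) * sqrt(S(1,1));
h = V(:,1)' * sqrt(S(1,1));

% Reference result: full 2D correlation
out1 = filter2(window, im);

% conv2(u, v, A) convolves the columns of A with u, then the rows with v.
% Flipping both factors turns the convolution into a correlation.
out3 = conv2(flipud(v), fliplr(h), im, 'same');

% Relative difference between the two results (expected to be at roundoff level)
relErr = norm(out1 - out3) / norm(out1)

% Optional timing comparison; the numbers depend on machine and MATLAB version
t2d  = timeit(@() filter2(window, im));
tsep = timeit(@() conv2(flipud(v), fliplr(h), im, 'same'));
fprintf('filter2: %.4f s, separable conv2: %.4f s\n', t2d, tsep);
```

Whether this actually beats filter2 is something to measure rather than assume; timeit is used here instead of a tic/toc loop only because it handles warm-up and repetition on its own.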