
I have two signals x1 and x2. I'm computing their convolution once with conv(x1,x2) directly and once via fft and ifft, and comparing the execution time of the two approaches.

I don't understand why the FFT-based version is not faster than conv.

n = 0: 2^15 ;
x1(n+1) = (0.25).^n ;
x2(n+1) = 1;

tic
Time_Convolution = conv(x1,x2);
toc


Padding = (length(x1) + length(x2) - 1) - length(x1);
x1_before_fft = [x1 zeros(1, Padding)];
x2_before_fft = [x2 zeros(1, Padding)];

tic
Convolution = ifft(fft( x1_before_fft).*fft(x2_before_fft));   
toc

Here is the output

Elapsed time is 0.010414 seconds.
Elapsed time is 0.017308 seconds.

1: Use timeit to measure computation time, tic/toc is not accurate in this context. 2: your data is all zeros except for one element, this is likely going to affect your computations too. - Cris Luengo
Accepting the point made by @CrisLuengo about timeit, 0.010 seconds is quite a lot faster than 0.017 seconds. - bazza
Why wouldn't execution time with fft be less? What makes you think conv is not optimized, perhaps using an FFT internally? - Luis Mendo
@bazza: running this code by copy-pasting in the command line, you can see timing differences much, much larger than these. If you run this in a fresh MATLAB session, you might even see some simple function take a second while the M-file is loaded. - Cris Luengo
Using Octave, I see conv taking 2.0s, fft taking 0.012s. I think @Luis is right, you're seeing a very nifty implementation of conv. - Cris Luengo

1 Answer


The key here is that one of your input signals is mostly zeros: 0.25.^n = 2^(-2n) underflows to exactly zero once n is large enough (past n ≈ 537 in double precision), so almost all of x1 is exact zeros:

 >> nnz(x1) / numel(x1)
 ans =
     0.0164

Less than 2% of your first input is non-zero. A direct implementation of convolution can exploit this and skip the many operations that contribute nothing to the result. An FFT-based convolution, by contrast, does essentially the same amount of work regardless of the values of the coefficients.
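The underflow is easy to reproduce outside MATLAB as well; here is a quick NumPy check (a sketch of my own, not part of the original answer):

```python
import numpy as np

n = np.arange(2**15 + 1)
x1 = 0.25 ** n          # 0.25**n == 2**(-2*n): underflows to exactly 0.0
                        # once 2*n exceeds the double-precision exponent range

sparsity = np.count_nonzero(x1) / x1.size
print(sparsity)         # well under 2% of the entries are non-zero
```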

The story is quite different when you make those coefficients non-zero. On my machine, I see:

 >> tic; for k = 1:100, conv(x1,x2); end, toc
 Elapsed time is 0.291373 seconds.
 >> x1 = x1 + eps;
 >> tic; for k = 1:100, conv(x1,x2); end, toc
 Elapsed time is 3.937819 seconds.

Another thing to note is that you picked an unfortunate FFT size for the convolution. Any transform with at least length(x1)+length(x2)-1 points will do the trick (you just trim off any extra coefficients at the end if the transform is larger), so it's best to pick a size with small prime factors. In this case, however, length(x1)+length(x2)-1 = 65537 is itself prime, so it's the worst possible choice. Look at the difference you get by increasing that length by just 1:

 >> N = length(x1)+length(x2)-1;  % Original size.
 >> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
 Elapsed time is 1.036913 seconds.
 >> N = length(x1)+length(x2);  % Better size.
 >> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
 Elapsed time is 0.289473 seconds.
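The trim-to-length trick is easy to state in code. Here is a NumPy translation (not from the original answer; fft_conv is a name of my own) showing that any sufficiently large transform yields the same linear convolution once the tail is trimmed:

```python
import numpy as np

def fft_conv(a, b, nfft=None):
    """Linear convolution via FFT: any nfft >= len(a) + len(b) - 1 works,
    provided the extra coefficients are trimmed off afterwards."""
    L = len(a) + len(b) - 1
    if nfft is None:
        nfft = L                      # the minimal (but possibly prime!) size
    y = np.fft.ifft(np.fft.fft(a, nfft) * np.fft.fft(b, nfft))
    return y.real[:L]                 # trim padding back to the linear length

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, -1.0])
print(fft_conv(a, b))                 # matches np.convolve(a, b)
print(fft_conv(a, b, nfft=16))        # larger transform, same trimmed result
```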

Of course, you can do even better if you push the prime factors of N lower still:

 >> N = length(x1)+length(x2)
 N =
        65538
 >> while max(factor(N)) > 7, N = N + 2; end
 >> N
 N =
        65610
 >> tic; for k = 1:100, ifft(fft( x1, N).*fft(x2,N)); end, toc
 Elapsed time is 0.250967 seconds.

So just by picking a better transform size you get a 4x speedup, and now, even with conv's ability to exploit all of those zeros, the FFT approach comes out ahead.
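The size search in that last snippet translates directly. Here is a sketch of my own (helper names are made up, not from the answer) of finding the next even size whose prime factors are all at most 7:

```python
def largest_prime_factor(n):
    """Largest prime factor by trial division; fine for sizes around 2**16."""
    p, largest = 2, 1
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    return max(largest, n) if n > 1 else largest

def next_even_smooth(n, smooth=7):
    """Smallest even size >= n (n even) whose prime factors are all <= smooth.
    Mirrors the MATLAB loop: while max(factor(N)) > 7, N = N + 2; end"""
    while largest_prime_factor(n) > smooth:
        n += 2
    return n

print(next_even_smooth(2**16 + 2))  # → 65610, the size the answer's loop finds
```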