I'm currenty aiming to optimize my fast wavelet transform (FWT) algorithm for 2D signals (images). It works as follows:
- one iteration of 1D FWT does convolution of 1D input data with a selected 1D filter (lengths from 2 to approx. 60) and downsamples the result
- algorithm for 2D transform does 1D FWT across all rows and then all columns of the input image
- it iterates if more levels are desired
The transform is a part of interactive application that demonstrates wavelets and their use. It works fairly fast and usually responds in real time to user's interactions. But if the filter is very long some performance issues occur. I've read that using Fast Fourier Transorm (FFT) instead of convolution is effective for long enough filters.
I've already implemented the 1D FFT, but the question is how to use it for maximum efficiency? Should I transform the input data before single 1D FWT, then perform convolution (which corresponds to multiplication in frequency domain), and then transform the data back using inverse FFT? Also, how is the multiplication done exactly? For example, the input data of length 256 and filter of length 4 are both transformed using FFT and then only the first 4 values of input data are multiplied before transforming the data back? I'm struggling a little bit on the details and would very much appreciate any insight into this.
EDIT: I've figured out that in my case I'm after circular convolution, therefore the filter should be zero padded so that its length is the same as length of the input data. But my question about efficiency still holds. How should I use FFT for FWT computation in order to be beneficial?
EDIT 2: This question was answered on dsp.stackexchange.com.