Image Gaussian convolution in Fourier domain: works, while should't

Question

The problem is I can't fully understand the principles of convolution in frequency domain.

I have an image of size 256x256, which I want to convolve with 3x3 gaussian matrix. It's coefficients are (1/16, 1/8, 1/4):

PlainImage<float> FourierRunner::getGaussMask(int sz)
{
    PlainImage<float> G(3,3);
    *G.at(0, 0) = 1.0/16; *G.at(0, 1) = 1.0/8; *G.at(0, 2) = 1.0/16;
    *G.at(1, 0) = 1.0/8; *G.at(1, 1) = 1.0/4; *G.at(1, 2) = 1.0/8;
    *G.at(2, 0) = 1.0/16; *G.at(2, 1) = 1.0/8; *G.at(2, 2) = 1.0/16;
    return G;
}

To get FFT of both image and filter kernel, I zero-pad them. sz_common stands for the extended size. Image and kernel are moved to the center of h and g ComplexImages respectively, so they are zero-padded at right, left, bottom and top.

I've read that size should be sz_common >= sz+gsz-1 because of circular convolution property: filter can change undesired image values on boundaries.
But it don't works: adequate results are only when sz_common = sz, when sz_common = sz+gsz-1 or sz_common = 2*sz, after IFFT I get 2-3 times smaller convolved image! Why?

Also I'm confused that filter matrix values should be multiplied by 256, like pixel values: other questions on SO contain Matlab code without such normalization. As in previous case, without such multiplying it works bad: I get black image. Why?

// fft_in is shifted fourier image with center in [sz/2;sz/2]

void FourierRunner::convolveImage(ComplexImage& fft_in)
{
    int sz = 256; // equal to fft_in.width()

    // Get original complex image (backward fft_in)
    ComplexImage original_complex = fft_in;
    fft2d_backward(fft_in, original_complex);

    int gsz = 3;
    PlainImage<float> filter = getGaussMask(gsz);
    ComplexImage filter_complex = ComplexImage::fromFloat(filter);

    int sz_common = pow2ceil(sz); // should be sz+gsz-1 ???

    ComplexImage h = ComplexImage::zeros(sz_common,sz_common);
    ComplexImage g = ComplexImage::zeros(sz_common,sz_common);

    copyImageToCenter(h, original_complex);
    copyImageToCenter(g, filter_complex);

    LOOP_2D(sz_common, sz_common) g.setPoint(x, y, g.at(x, y)*256);

    fft2d_forward(g, g);
    fft2d_forward(h, h);
    fft2d_fft_shift(g);

    // CONVOLVE
    LOOP_2D(sz_common,sz_common) h.setPoint(x, y, h.at(x, y)*g.at(x, y));

    copyImageToCenter(fft_in, h);
    fft2d_backward(fft_in, fft_in);
    fft2d_fft_shift(fft_in);

    // TEST DIFFERENCE BTW DOMAINS
    PlainImage<float> frequency_res(sz,sz);
    writeComplexToPlainImage(fft_in, frequency_res);

    fft2d_forward(fft_in, fft_in);
}

I tried to zero-padd image at right and bottom, such that smaller image is copied to the start of bigger, but it also doesn't work.
I wrote convolution in spatial domain to compare results, frequency blur results are almost the same as in spatial domain (avg. error btw pixels is 5), only when sz_common = sz.

So, could you explain phenomena of zero-padding and normalization for this case? Thanks in advance.

Royi Royi · Accepted Answer · 2015-09-23T11:13:10

Convolution in the Spatial Domain is equivalent of Multiplication in the Fourier Domain.

This is the truth for Continuous functions which are defined everywhere.
Yet in practice, we have discrete signals and convolution kernels.
Which require more gentle caring.

If you have an image of the size M x N and a Kernel of the size of MM x NN if you apply DFT (FFT is an efficient way to calculate the DFT) on them you'll get functions of the size of M x N and MM x NN respectively.
Moreover, the theorem above, about the multiplication equivalence requires to multiply the same frequencies one with each other.

Since practically the Kernel is much smaller than the image, usually it is zero padded to the size of the image.
Now, by applying the DFT you'll get to matrices of the same M x N size and will be able to multiply them.
Yet, this will be equivalent of the Circular Convolution between the Image and Kernel.

To apply the linear convolution you should make them both in the size of (M + MM - 1) x (N + NN - 1).
Usually this would be by applying "Replicate" boundary condition on the image and zero pad the Kernel.

Enjoy...

P.S. Could you support a new Community Proposal for SE at - http://area51.stackexchange.com/proposals/86832/.
We need more people to follow, up vote questions with less than 10 up votes and more question to be asked.

Thank You.

Image Gaussian convolution in Fourier domain: works, while should't

1 Answers