
I am doing a 1D FFT. I have the same input data as would go into FFTW; however, the output from CUFFT does not seem to be "aligned" the same way FFTW's is. That is, in my FFTW code, I could calculate the center of the zero padding, then do some shifting to "left-align" all my data and have trailing zeros.

In CUFFT, the result of the FFT looks like the same data; however, the zeros are not "centered" in the output, so the rest of my algorithm breaks. (After the shift to left-align the data, there is still a "gap" in it.)

Can anyone give me any insight? I thought it had something to do with those compatibility flags, but even with cufftSetCompatibilityMode(plan, CUFFT_COMPATIBILITY_FFTW_ALL); I am still getting a bad result.

Below is a plot of the magnitude of the data from the first row. The data on the left is the output of the inverse CUFFT, and the output on the right is the output of the inverse FFTW.

Thanks!

[Image: magnitude plot of the first row; inverse CUFFT output on the left, inverse FFTW output on the right]

Here is the setup code for the FFTW and CUFFT plans. FFTW:

ifft = fftwf_plan_dft_1d(freqCols,
                         reinterpret_cast<fftwf_complex*>(indata),
                         reinterpret_cast<fftwf_complex*>(outdata),
                         FFTW_BACKWARD, FFTW_ESTIMATE);

CUFFT:

cufftSetCompatibilityMode(plan, CUFFT_COMPATIBILITY_FFTW_ALL);
cufftPlan1d(&plan, width, CUFFT_C2C, height);

And the code that executes them. FFTW:

fftwf_execute(ifft);

CUFFT:

cufftExecC2C(plan, d_image, d_image, CUFFT_INVERSE); //in place inverse

I put together some test code:

complex<float> *input = (complex<float>*)fftwf_malloc(sizeof(fftwf_complex) * 100);
complex<float> *output = (complex<float>*)fftwf_malloc(sizeof(fftwf_complex) * 100);

fftwf_plan ifft;
ifft = fftwf_plan_dft_1d(100, reinterpret_cast<fftwf_complex*>(input),
                         reinterpret_cast<fftwf_complex*>(output),
                         FFTW_BACKWARD, FFTW_ESTIMATE);

cufftComplex *inplace = (cufftComplex *)malloc(100*sizeof(cufftComplex));
cufftComplex *d_inplace;
cudaMalloc((void **)&d_inplace, 100*sizeof(cufftComplex));
for(int i = 0; i < 100; i++)
{
    inplace[i] = make_cuComplex(cos(.5*M_PI*i), sin(.5*M_PI*i));
    input[i] = complex<float>(cos(.5*M_PI*i), sin(.5*M_PI*i));
}

cutilSafeCall(cudaMemcpy(d_inplace, inplace, 100*sizeof(cufftComplex), cudaMemcpyHostToDevice));
cufftHandle plan;
cufftPlan1d(&plan, 100, CUFFT_C2C, 1);
cufftExecC2C(plan, d_inplace, d_inplace, CUFFT_INVERSE);
cutilSafeCall(cudaMemcpy(inplace, d_inplace, 100*sizeof(cufftComplex), cudaMemcpyDeviceToHost));

fftwf_execute(ifft);

When I dumped the output from both of these FFT calls, it did look the same. I am not exactly sure what I was looking at though. The data had a value of 100 in the 75th row. Is that correct?

Can you post some of the data from each, e.g. for the first few bins? - Paul R
Added a screenshot of the resulting magnitude plot of the first row, which illustrates what I am talking about. - Derek
Are you doing a forward FFT prior to this inverse FFT, or did you start out in the frequency domain? Does the frequency domain data match? - Paul R
The data starts out as complex data. They do match exactly, in both the FFTW and CUFFT versions. - Derek
OK - is this a complex-to-real IFFT or complex-to-complex? You might want to double check how both FFTW and CUFFT expect the frequency domain data to be ordered, particularly the 0 and N/2 bins. It looks like you just have a shift in the time domain result but I can't quite guess how this might be happening... - Paul R

1 Answer


It looks like you may have swapped the real and imaginary components of your complex data in the input to one of the IFFTs. This swap will change an even function to an odd function in the time domain.