Is there a method of FFT that will run inside CUDA Kernel?

Question

I am currently converting a C++ program to CUDA, and part of the program runs a fast Fourier transform. Originally I used FFTW, but I found that I couldn't call it from inside a kernel. I then rewrote that part using cuFFT, but I get the same error: it cannot be called from a kernel either.

Is there any FFT implementation that will run inside a CUDA kernel?

Can I just add __device__ to the functions in the FFTW library?

I would like to avoid having to initialize or call the FFT from the host. I want a function that runs entirely on the GPU, if one exists.

hang hang · Accepted Answer · 2012-07-22T04:09:27

If you want to call an FFT from inside a kernel, it sounds like you are trying to perform several FFTs at once. I would look into the batch processing features of cuFFT: cufftPlanMany() creates a plan for batched FFTs and supports many different memory layouts. What is your application?
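As a rough illustration of the batched approach, here is a minimal sketch that runs several 1D complex-to-complex FFTs with a single cufftPlanMany() plan. The helper name batched_fft_1d and the contiguous layout (signal b starts at offset b * n) are assumptions made for this example, not something from the original question; the plan is still created and executed from the host, but all transforms run on the GPU in one call.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical helper: runs `batch` independent forward FFTs of length `n`.
// Assumes host_data holds batch * n samples, with signal b at offset b * n.
// The data is transformed in place; returns true if every call succeeded.
bool batched_fft_1d(std::vector<cufftComplex>& host_data, int n, int batch)
{
    if (n <= 0 || batch <= 0) return false;
    if (host_data.size() != static_cast<std::size_t>(n) * batch) return false;

    const std::size_t bytes = host_data.size() * sizeof(cufftComplex);
    cufftComplex* d_data = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&d_data), bytes) != cudaSuccess)
        return false;

    bool ok = cudaMemcpy(d_data, host_data.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;

    cufftHandle plan;
    bool have_plan = false;
    if (ok) {
        int dims[1] = { n };
        // Passing nullptr for inembed/onembed selects the default contiguous
        // layout, in which case the stride/dist arguments are ignored.
        have_plan = cufftPlanMany(&plan, 1, dims,
                                  nullptr, 1, n,   // input layout
                                  nullptr, 1, n,   // output layout
                                  CUFFT_C2C, batch) == CUFFT_SUCCESS;
        ok = have_plan;
    }

    // One call executes all `batch` transforms on the device, in place.
    if (ok)
        ok = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD) == CUFFT_SUCCESS;

    // Blocking copy back to the host; it waits for the FFT to finish.
    if (ok)
        ok = cudaMemcpy(host_data.data(), d_data, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;

    if (have_plan) cufftDestroy(plan);
    cudaFree(d_data);
    return ok;
}
```

This should build with something like `nvcc file.cu -lcufft` once a main() that calls the helper is added. If the signals are interleaved or padded in memory rather than stored back to back, the inembed/istride/idist arguments (and their output counterparts) are where that layout would be described.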
