
So a cufftComplex array holds n structs, each with an x and a y field that represent the real and the imaginary part of a complex number, respectively.

On the other hand, if I want to create a vertex buffer object (VBO) in OpenGL with an x and a y field, i.e. a buffer of 2D vertices, or simply a VBO that also represents n complex numbers, I have to create an array of 2n floats with a layout like this:

x0 y0 | x1 y1 | ... | xn-1 yn-1

I then write it to the VBO by calling:

glBufferData(GL_ARRAY_BUFFER, 2 * n * sizeof(GLfloat), complex_values_array, GL_DYNAMIC_DRAW);
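
To make the layout question concrete, here is a minimal host-side sketch. It assumes a stand-in struct, `Complex2f`, that mirrors how cuFFT declares cufftComplex (a float2, i.e. two consecutive floats). The names `Complex2f`, `complexBufferBytes` and `toInterleavedFloats` are hypothetical and only serve the illustration. The sketch suggests that an array of n such structs already has the interleaved x0 y0 x1 y1 ... layout, and that it occupies 2 * n * sizeof(float) bytes.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for cufftComplex (cuFFT declares it as a float2:
// two consecutive floats, x = real part, y = imaginary part).
struct Complex2f {
    float x;  // real part
    float y;  // imaginary part
};

// The struct must be exactly two packed floats for the byte copy below.
static_assert(sizeof(Complex2f) == 2 * sizeof(float), "unexpected padding");

// Number of bytes occupied by n complex values; this is the size
// a buffer upload of n complex numbers would have to pass.
std::size_t complexBufferBytes(std::size_t n) {
    return n * sizeof(Complex2f);  // == 2 * n * sizeof(float)
}

// Copy n complex values into a flat float array: x0 y0 x1 y1 ...
// A plain byte copy is enough because the two layouts are identical.
std::vector<float> toInterleavedFloats(const std::vector<Complex2f>& z) {
    std::vector<float> flat(2 * z.size());
    if (!z.empty()) {
        std::memcpy(flat.data(), z.data(), complexBufferBytes(z.size()));
    }
    return flat;
}
```

Under that assumption the memory layouts match, so what is left to solve is computing the displayed quantity (e.g. the magnitude), not converting between types.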

I would like to Fourier-transform an image with cuFFT and display, for example, the magnitude of the complex values. How do I resolve this incompatibility between the two data types? Is there a way for cuFFT to act on VBOs?


Perhaps I should write a CUDA kernel that takes the cufftComplex data and maps the magnitude of each complex number to a 1D VBO, or a CUDA kernel that maps the cufftComplex data to a 2D VBO. I do not know what the overhead would be, but since the copy is device-to-device I expect it to be manageable.

cuFFT can stride and scatter both its input and its output, just like FFTW can. But if you want to display the magnitude of the complex values, that is a separate issue. It needs to be calculated. It's not merely an "incompatibility between the two data types". - Robert Crovella
Yes, but I thought I could feed complex values to the vertex shader and calculate the magnitude there. - Jan M.
Probably you could. That's the first mention of your intention to write a shader that I've seen in the question, although I acknowledge it's tagged glsl. Anyway, it's not an incompatibility between data types. It's a calculation that has to be performed; cuFFT won't do that for you. You could do it in a CUDA kernel (and also do any necessary data shuffling there) or perhaps in a shader. Using CUDA-OpenGL interop, you could map an OpenGL resource to CUDA. cuFFT (or your magnitude/shuffle kernel) could populate that resource, and then you could hand it back to OpenGL. - Robert Crovella
That's ultimately what I did: a CUDA kernel that takes cufftComplex and writes it to an OpenGL resource. The performance impact wasn't too bad. - Jan M.
@PatronBernard: Could you add your solution as an answer please, so that this question falls off the unanswered queue? - talonmies

1 Answer


I managed to resolve this issue by writing a kernel as follows:

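// PI and PI2 are macros defined elsewhere in the original code (presumably pi and pi/2).
__global__ void cufftComplex2Float(float* vbo_magnitude, Complex *z, const int width, const int height){
    unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;   // column index
    unsigned int j = blockIdx.y*blockDim.y + threadIdx.y;   // row index
    if (i >= width || j >= height) return;                  // skip threads outside the image
    unsigned int idx = j*width + i;                         // row-major index
    float magnitude = z[idx].x*z[idx].x + z[idx].y*z[idx].y; // squared magnitude |z|^2
    vbo_magnitude[idx] = (PI2 + atanf(sqrtf(0.1f*magnitude))) / PI;
}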

It involves no host-device transfers so it's pretty quick.
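
For checking the kernel's output on the CPU and for choosing a launch configuration, a small host-side sketch may help. It assumes that PI is pi and PI2 is pi/2, which the answer does not show, and the helper names `displayValue` and `blocksFor` are hypothetical.

```cpp
#include <cmath>

// Hypothetical host-side reference for the value the kernel writes,
// assuming PI is pi and PI2 is pi/2 (their definitions are not shown).
float displayValue(float re, float im) {
    const float pi = 3.14159265358979f;
    const float power = re * re + im * im;  // squared magnitude |z|^2
    // atan compresses the unbounded magnitude into a bounded range,
    // so the result lies between 0.5 and 1.
    return (0.5f * pi + std::atan(std::sqrt(0.1f * power))) / pi;
}

// Number of thread blocks needed to cover `extent` elements with
// `blockSize` threads per block (rounds up, so the last block may be partial).
unsigned int blocksFor(unsigned int extent, unsigned int blockSize) {
    return (extent + blockSize - 1) / blockSize;
}
```

With 16x16 thread blocks, for example, the grid would be blocksFor(width, 16) by blocksFor(height, 16). Because the last block in each dimension may extend past the image, a kernel launched this way needs a bounds check on i and j.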