I have a real 3D array of dimensions Nx*Ny*Nz and want to take a 2D Fourier transform for each z value using FFTW. Here the z index is the fastest varying in memory. Currently the following code works as expected:
int Nx = 16; int Ny = 8; int Nz = 3;
// allocate memory
const int dims = Nx * Ny * Nz;
// input data (pre Fourier transform)
double *input = fftw_alloc_real(dims);
// why is this the required output size?
const int outdims = Nx * (Ny/2 + 1) * Nz;
// we want to perform the transform out of place
// (so seperate array for output)
fftw_complex *output = fftw_alloc_complex(outdims);
// setup "plans" for forward and backward transforms
const int rank = 2; const int howmany = Nz;
const int istride = Nz; const int ostride = Nz;
const int idist = 1; const int odist = 1;
int n[] = {Nx, Ny};
int *inembed = NULL, *onembed = NULL;
fftw_plan fp = fftw_plan_many_dft_r2c(rank, n, howmany,
input, inembed, istride, idist,
output, onembed, ostride, odist,
FFTW_PATIENT);
fftw_plan bp = fftw_plan_many_dft_c2r(rank, n, howmany,
output, onembed, ostride, odist,
input, inembed, istride, idist,
FFTW_PATIENT);
As I understand it, transforming a 1D sequence of length N requires (N/2 + 1) complex values so why does the above code crash if instead I set outdims = (Nx/2 + 1)*(Ny/2 + 1)*Nz
as one might expect for a 2D transform?
Secondly am I right in thinking that I can access the real and imaginary parts of modes from qx = 0 to Nx/2
(inclusive) using the following:
#define outputRe(qx,qy,d) ( output[(d) + Nz * ((qy) + (Ny/2 + 1) * (qx))][0] )
#define outputIm(qx,qy,d) ( output[(d) + Nz * ((qy) + (Ny/2 + 1) * (qx))][1] )
EDIT: Full code and Makefile for those who want to play around. Assumes fftw and gsl installed.
EDIT2: If I understand correctly, the indexing (allowing positive and negative frequencies) should be (probably getting too messy for a macro!):
#define outputRe(qx,qy,d) ( output[(d) + Nz * ((qy) + (Ny/2 + 1) * ( ((qx) >= 0) ? (qx) : (Nx + (qx)) ) ) ][0] )
#define outputIm(qx,qy,d) ( output[(d) + Nz * ((qy) + (Ny/2 + 1) * ( ((qx) >= 0) ? (qx) : (Nx + (qx)) ) ) ][1] )
for (int qx = -Nx/2; qx < Nx/2; ++qx)
for (int qy = 0; qy <= Ny/2; ++qy)
outputRe(qx, qy, d) = ...
where outputRe(-Nx/2, qy, d)
points to the same data as outputRe(Nx/2, qy, d)
. In practice I would probably just to loop over the first index and convert to a frequency, rather than the other way round!