I'm testing the FFTW 3.3.8 C library for 1D Discrete Fourier Transform (DFT) calculations. The results I get are often incorrect when I use the float
(single-precision) version of the libraries, configured with the --enable-generic-simd128
or --enable-generic-simd256
options (and --enable-float
for float
support). I have tested this in MinGW-w64 as well as Windows Subsystem for Linux, with gcc as the compiler. I also get the same errors when I use the pre-built FFTW package for MinGW-w64, downloaded via pacman in MSYS2.
As a simple test, I'm using an input vector of 1's. The first element of the expected DFT should be equal to the length of the input vector, with all the other elements zero.
Has anyone run into this issue before, or would anyone be willing to try to reproduce it? Am I to make of this that the --enable-generic-simd128
and --enable-generic-simd256
optimizations are not supported when using the float
version of the library? My CPU is Intel i7-4720HQ.
Here is a simple test program to demonstrate the issue:
main.c
#include <stdio.h>
#include <fftw3.h>
int main()
{
fftwf_complex *in, *out;
fftwf_plan p;
int N = 21;
int i;
in = fftwf_malloc(sizeof(fftwf_complex) * N);
out = fftwf_malloc(sizeof(fftwf_complex) * N);
p = fftwf_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
for (i = 0; i < N; i++) {
in[i][0] = 1.0f;
in[i][1] = 0.0f;
}
fftwf_execute(p);
for (i = 0; i < N; i++)
printf("%d: %8.5g\t + j %8.5g\n", i, out[i][0], out[i][1]);
fftwf_destroy_plan(p);
fftwf_free(in);
fftwf_free(out);
}
I build it with gcc -o main main.c -lfftw3f -lm
. The output is as follows:
0: 21 + j -7.87
1: 0 + j 0
2: 0 + j 0
3: -5.2972 + j 1.9
4: 0 + j 0
5: 0 + j 0
6: -1.862 + j 1.08
7: 0 + j 0
8: 0 + j 0
9: -0.52584 + j 0.956
10: 0 + j 0
11: 0 + j 0
12: 0.52584 + j 0.956
13: 0 + j 0
14: 0 + j 0
15: 1.862 + j 1.08
16: 0 + j 0
17: 0 + j 0
18: 5.2972 + j 1.9
19: 0 + j 0
20: 0 + j 0
--enable-float
on a 2016 15-inch MacBook Pro running macOS 10.14.6 with Clang 11.0.0 and Xcode 11.3.1 and built and execute the code in the question. It displayed output of 21 for element 0 and values near 0 (such as 6.6324e-07 + j -2.0458e-07) for the other elements. After I rebuilt FFTW with--enable-generic-simd128
, the program gave the output shown in the question. – Eric Postpischil