FFTW single-precision library outputs incorrect DFT when using SIMD optimizations

Question

I'm testing the FFTW 3.3.8 C library for 1D Discrete Fourier Transform (DFT) calculations. The results I get are often incorrect when I use the float (single-precision) version of the libraries, configured with the --enable-generic-simd128 or --enable-generic-simd256 options (and --enable-float for float support). I have tested this in MinGW-w64 as well as Windows Subsystem for Linux, with gcc as the compiler. I also get the same errors when I use the pre-built FFTW package for MinGW-w64, downloaded via pacman in MSYS2.

As a simple test, I'm using an input vector of 1's. The first element of the expected DFT should be equal to the length of the input vector, with all the other elements zero.

Has anyone run into this issue before, or would anyone be willing to try to reproduce it? Am I to make of this that the --enable-generic-simd128 and --enable-generic-simd256 optimizations are not supported when using the float version of the library? My CPU is Intel i7-4720HQ.

Here is a simple test program to demonstrate the issue:

main.c

#include <stdio.h>
#include <fftw3.h>

int main()
{
    fftwf_complex *in, *out;
    fftwf_plan p;
    int N = 21;
    int i;

    in = fftwf_malloc(sizeof(fftwf_complex) * N);
    out = fftwf_malloc(sizeof(fftwf_complex) * N);
    p = fftwf_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);

    for (i = 0; i < N; i++) {
        in[i][0] = 1.0f;
        in[i][1] = 0.0f;
    }
    fftwf_execute(p);

    for (i = 0; i < N; i++) 
        printf("%d: %8.5g\t + j %8.5g\n", i, out[i][0], out[i][1]);

    fftwf_destroy_plan(p);
    fftwf_free(in);
    fftwf_free(out);
}

I build it with gcc -o main main.c -lfftw3f -lm. The output is as follows:

0:       21      + j    -7.87
1:        0      + j        0
2:        0      + j        0
3:  -5.2972      + j      1.9
4:        0      + j        0
5:        0      + j        0
6:   -1.862      + j     1.08
7:        0      + j        0
8:        0      + j        0
9: -0.52584      + j    0.956
10:        0     + j        0
11:        0     + j        0
12:  0.52584     + j    0.956
13:        0     + j        0
14:        0     + j        0
15:    1.862     + j     1.08
16:        0     + j        0
17:        0     + j        0
18:   5.2972     + j      1.9
19:        0     + j        0
20:        0     + j        0

Yes, I just wanted to make sure it's not just something I'm doing wrong or something wrong with my machine first. — Puk
I’m voting to close this question because it seeks to verify a bug before reporting. That's just a duplicate effort – report bugs as soon as you can reproduce them: you don't need third-party confirmation! Nobody would think less of you if the bug was in your code, not theirs. — Marcus Müller
I built FFTW 3.3.8 with --enable-float on a 2016 15-inch MacBook Pro running macOS 10.14.6 with Clang 11.0.0 and Xcode 11.3.1 and built and execute the code in the question. It displayed output of 21 for element 0 and values near 0 (such as 6.6324e-07 + j -2.0458e-07) for the other elements. After I rebuilt FFTW with --enable-generic-simd128, the program gave the output shown in the question. — Eric Postpischil
Thanks @EricPostpischil, I have since also reproduced the issue in native Linux with gcc and Clang, so I think this is likely a bug in FFTW, and I have reported it. — Puk

Eric Postpischil Eric Postpischil · Accepted Answer · 2020-06-13T17:21:49

This appears to be a bug in FFTW 3.3.8.

I built FFTW 3.3.8 with --enable-float on a 2016 15-inch MacBook Pro running macOS 10.14.6 with Clang 11.0.0 and Xcode 11.3.1 and built and executed the code in the question. It displayed output of 21 for element 0 and values near 0 (such as “6.6324e-07 + j -2.0458e-07”) for the other elements. After I rebuilt FFTW with --enable-generic-simd128 added, the program gave the output shown in the question.

FFTW single-precision library outputs incorrect DFT when using SIMD optimizations

2 Answers