Why does FFT produce complex numbers instead of real numbers?

86

votes

All the FFT implementations we have come across result in complex values (with real and imaginary parts), even if the input to the algorithm was a discrete set of real numbers (integers).

Is it not possible to represent frequency domain in terms of real numbers only?

algorithm math audio signal-processing fft

90

votes

The FFT is fundamentally a change of basis. The basis it changes your original signal into is a set of sinusoids. For that basis to describe every possible input, it has to be able to represent phase as well as amplitude; the phase is represented using complex numbers.

For example, suppose you FFT a signal containing only a single sine wave. Depending on phase you might well get an entirely real FFT result. But if you shift the phase of your input a few degrees, how else can the FFT output represent that input?
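To make the phase-shift example concrete, here is a minimal sketch. It uses a naive O(N^2) DFT written in plain Python as a stand-in for a library FFT (it computes the same values); the helper names `dft` and `tone_bin` and the choice of N = 8 are mine, purely for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; returns the same values an FFT would."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 8

def tone_bin(phase_deg):
    """DFT bin 1 of one cycle of a cosine with the given phase offset."""
    phase = math.radians(phase_deg)
    signal = [math.cos(2 * math.pi * t / N + phase) for t in range(N)]
    return dft(signal)[1]

aligned = tone_bin(0)    # imaginary part is (numerically) zero
shifted = tone_bin(30)   # same magnitude, but now a nonzero imaginary part

print(aligned, abs(aligned))
print(shifted, abs(shifted), math.degrees(cmath.phase(shifted)))
```

Both bins have the same magnitude; only the angle of the complex number changes, and that angle is where the 30 degree shift of the input ends up.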

edit: This is a somewhat loose explanation, but I'm just trying to motivate the intuition.

56

votes

The FFT provides you with amplitude and phase. The amplitude is encoded as the magnitude of the complex number (sqrt(x^2+y^2)), while the phase is encoded as the angle (atan2(y,x)). To get a strictly real result from the FFT, the incoming signal must be conjugate-symmetric (i.e. x[n]=conj(x[N-n]), with x[N] taken as x[0]); for a real signal this is just even symmetry.

If all you care about is intensity, the magnitude of the complex number is sufficient for analysis.
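A small sketch of the magnitude/phase formulas and the symmetry condition above. The naive `dft` helper stands in for a library FFT, and the two sample vectors are made up for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; returns the same values an FFT would."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

# Real and even-symmetric: x[n] == x[(N - n) % N]
even = [3.0, 1.0, 2.0, 5.0, 2.0, 1.0]
# Real but not symmetric
skewed = [3.0, 1.0, 2.0, 5.0, 4.0, 0.0]

even_spec = dft(even)
skew_spec = dft(skewed)

# Largest imaginary part across all bins of each spectrum
max_imag_even = max(abs(c.imag) for c in even_spec)
max_imag_skew = max(abs(c.imag) for c in skew_spec)

# Amplitude and phase of one bin, written out as in the formulas above
x, y = skew_spec[1].real, skew_spec[1].imag
magnitude = math.sqrt(x * x + y * y)
phase = math.atan2(y, x)

print(max_imag_even, max_imag_skew, magnitude, phase)
```

The symmetric input gives imaginary parts that are zero up to rounding, while the asymmetric one does not.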

42

votes

Yes, it is possible to represent the FFT frequency domain results of strictly real input using only real numbers.

Those complex numbers in the FFT result are simply pairs of real numbers, both of which are required to give you the 2D coordinates of a result vector that has both a length and a direction angle (or a magnitude and a phase). And every frequency component in the FFT result can have a unique amplitude and a unique phase (relative to some point in the FFT aperture).

One real number alone can't represent both magnitude and phase. If you throw away the phase information, you can badly distort the signal when you try to recreate it using an iFFT (unless the signal happens to be symmetric). So a complete FFT result requires 2 real numbers per FFT bin. Some FFTs bundle these 2 real numbers together in a complex data type by common convention, but the FFT result could just as easily be (and in some FFTs is) 2 real vectors (one for cosine coordinates and one for sine coordinates).
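Here is a rough sketch of both points: the spectrum stored as two plain real vectors, and what happens if you try to invert from magnitudes alone. The naive `dft`/`idft` helpers stand in for a library FFT, and the sample signal is arbitrary.

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT; returns the same values an FFT would."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT (opposite exponent sign, scaled by 1/N)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

signal = [0.0, 1.0, 3.0, 2.0, -1.0, -2.0, 0.5, 0.0]
spectrum = dft(signal)

# The same result held as two real vectors instead of one complex vector
cos_part = [c.real for c in spectrum]   # real parts (cosine correlations)
sin_part = [c.imag for c in spectrum]   # imaginary parts (sine correlations, up to sign convention)

# Inverting from both real vectors recovers the signal
rebuilt = [v.real for v in idft([complex(a, b) for a, b in zip(cos_part, sin_part)])]

# Inverting from magnitudes only (phase thrown away) does not
magnitude_only = [v.real for v in idft([abs(c) for c in spectrum])]

err_full = max(abs(a - b) for a, b in zip(signal, rebuilt))
err_mag = max(abs(a - b) for a, b in zip(signal, magnitude_only))
print(err_full, err_mag)
```

The first error is at rounding level; the second is of the same order as the signal itself.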

There are also FFT routines that produce magnitude and phase directly, but they run more slowly than FFTs that produce a complex (or two real) vector result. There also exist FFT routines that compute only the magnitude and just throw away the phase information, but they usually run no faster than letting you do that yourself after a more general FFT. Maybe they save a coder a few lines of code, at the cost of not being invertible. But a lot of libraries don't bother to include these slower and less general forms of FFT, and just let the coder convert or ignore what they need or don't need.

Plus, many consider the math involved to be a lot more elegant using complex arithmetic (where, for strictly real input, the cosine correlation or even component of an FFT result is put in the real component, and the sine correlation or odd component of the FFT result is put in the imaginary component of a complex number.)

(Added:) And, as yet another option, you can consider the two components of each FFT result bin, instead of as real and imaginary components, as even and odd components, both real.
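The even/odd view can be checked numerically. In this sketch (naive `dft` helper standing in for a library FFT, arbitrary sample data), a real signal is split into its even and odd parts about index 0, and the two parts are transformed separately.

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT; returns the same values an FFT would."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

signal = [1.0, 4.0, -2.0, 0.5, 3.0, -1.0]
n = len(signal)

# Circular time reversal: index t maps to (-t) mod N
reversed_sig = [signal[(-t) % n] for t in range(n)]
even_part = [(a + b) / 2 for a, b in zip(signal, reversed_sig)]
odd_part = [(a - b) / 2 for a, b in zip(signal, reversed_sig)]

spectrum = dft(signal)
even_spec = dft(even_part)   # comes out purely real
odd_spec = dft(odd_part)     # comes out purely imaginary

# Real part of the full spectrum is the transform of the even part,
# imaginary part is the transform of the odd part.
err_even = max(abs(s.real - e.real) for s, e in zip(spectrum, even_spec))
err_odd = max(abs(s.imag - o.imag) for s, o in zip(spectrum, odd_spec))
print(err_even, err_odd)
```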

21

votes

If your FFT coefficient for a given frequency f is x + i y, you can look at x as the coefficient of a cosine at that frequency and y as the coefficient of a sine (up to the sign and scaling conventions of your FFT). If you add these two waves for a particular frequency, you get a phase-shifted wave at that frequency; the amplitude of this wave is sqrt(x*x + y*y), equal to the magnitude of the complex coefficient.
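A quick numerical check of that claim, using the identity x*cos(a) + y*sin(a) = A*cos(a - p) with A = sqrt(x*x + y*y) and p = atan2(y, x). The values x = 3, y = 4 are arbitrary picks for illustration.

```python
import math

x, y = 3.0, 4.0                      # cosine and sine coefficients
amplitude = math.hypot(x, y)         # sqrt(x*x + y*y)
phase = math.atan2(y, x)

# Sample one full cycle in 1 degree steps
angles = [2 * math.pi * i / 360 for i in range(360)]

# Sum of the two waves vs. a single phase-shifted cosine
summed = [x * math.cos(a) + y * math.sin(a) for a in angles]
shifted = [amplitude * math.cos(a - phase) for a in angles]

max_diff = max(abs(s - t) for s, t in zip(summed, shifted))
peak = max(summed)
print(amplitude, max_diff, peak)
```

The two curves agree to rounding error, and the peak of the sum never exceeds the magnitude of the coefficient.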

The Discrete Cosine Transform (DCT) is a relative of the Fourier transform which yields all real coefficients. A two-dimensional DCT is used by many image/video compression algorithms.
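For illustration, here is a naive, unnormalized DCT-II with its inverse (a scaled DCT-III), written out in plain Python. This is only a sketch of one common DCT variant; real codecs use fast, normalized, usually block-based implementations, and the function names `dct2` and `idct2` are my own.

```python
import math

def dct2(x):
    """Naive unnormalized DCT-II: X[k] = sum_t x[t] * cos(pi * (t + 1/2) * k / N)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def idct2(c):
    """Inverse of dct2 above: a DCT-III scaled by 2/N."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (t + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for t in range(n)]

samples = [0.0, 1.0, 3.0, 2.0, -1.0, -2.0, 0.5, 0.0]
coeffs = dct2(samples)      # plain floats, no complex numbers anywhere
restored = idct2(coeffs)

roundtrip_err = max(abs(a - b) for a, b in zip(samples, restored))
print(coeffs)
print(roundtrip_err)
```

N real samples go in, N real coefficients come out, and the transform is still invertible.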

9

votes

The discrete Fourier transform is fundamentally a transformation from a vector of complex numbers in the "time domain" to a vector of complex numbers in the "frequency domain" (I use quotes because, with the right scaling factors, the DFT is almost its own inverse: the inverse differs only in the sign of the exponent). If your inputs are real, you can perform two DFTs at once: take the input vectors x and y and calculate Z = F(x + i y). Separating the two DFTs afterwards relies on symmetry and complex conjugates: X[k] = (Z[k] + conj(Z[N-k]))/2 and Y[k] = (Z[k] - conj(Z[N-k]))/(2i), with indices taken mod N.

The discrete cosine transform sort-of lets you represent the "frequency domain" with the reals, and is common in lossy compression algorithms (JPEG, MP3). The surprising thing (to me) is that it works even though it appears to discard phase information, but this also seems to make it less useful for most signal processing purposes (I'm not aware of an easy way to do convolution/correlation with a DCT).
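The "two real DFTs for the price of one complex DFT" trick can be sketched as follows. It relies on the fact that the DFT of a real vector satisfies X[k] = conj(X[N-k]); the naive `dft` helper stands in for a library FFT, and the two input vectors are arbitrary.

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT; returns the same values an FFT would."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
y = [0.0, -2.0, 1.5, 4.0, 1.0, -3.0]
n = len(x)

# One complex transform of x + i*y
Z = dft([complex(a, b) for a, b in zip(x, y)])

# Separate the two spectra using conjugate symmetry (indices mod N)
X = [(Z[k] + Z[(-k) % n].conjugate()) / 2 for k in range(n)]
Y = [(Z[k] - Z[(-k) % n].conjugate()) / 2j for k in range(n)]

# Compare against transforming each real vector on its own
err_x = max(abs(a - b) for a, b in zip(X, dft(x)))
err_y = max(abs(a - b) for a, b in zip(Y, dft(y)))
print(err_x, err_y)
```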

I've probably gotten some details wrong ;)

2

votes

The way you've phrased this question, I believe you are looking for a more intuitive way of thinking rather than a mathematical answer. I come from a mechanical engineering background, and this is how I think about the Fourier transform.

I contextualize the Fourier transform with reference to a pendulum. If we have only the x-velocity vs. time of a pendulum and we are asked to estimate the energy of the pendulum (or the forcing source of the pendulum), the Fourier transform gives a complete answer. As what we are usually observing is only the x-velocity, we might conclude that the pendulum only needs to be provided energy equivalent to its sinusoidal variation of kinetic energy. But the pendulum also has potential energy. This energy is 90 degrees out of phase with the kinetic energy. So to keep track of the potential energy, we are simply keeping track of the 90-degree-out-of-phase part of the (kinetic) real component. The imaginary part may be thought of as a 'potential velocity' that represents a manifestation of the potential energy that the source must provide to force the oscillatory behaviour. What is helpful is that this can be easily extended to the electrical context, where capacitors and inductors also store energy in 'potential form'.

If the signal is not sinusoidal, the transform is of course trying to decompose it into sinusoids. I see this as assuming that the final signal was generated by the combined action of infinitely many sources, each with a distinct sinusoidal behaviour. What we are trying to determine is the strength and phase of each source that creates the final observed signal at each time instant.

PS: 1) The last two statements are generally how I think of the Fourier transform itself. 2) I say potential velocity rather than potential energy because the transform usually does not change the dimensions of the original signal or physical quantity, so it cannot shift from representing velocity to representing energy.


6 Answers