4
votes

I understand that the magnitude and phase are captured in the real and imaginary parts in the result of an fft. But how does each sample capture phase?

Is the phase related to the N discrete samples provided in time domain?

That is, if the input included 44100 samples for one second, does each resulting value of the FFT represent 1/44100 of the phase?

For example, the first FFT value is at frequency 1/44100 and the second value is 2/44100 and so on?

4
You have to be more specific about what you want the phase of. e.g. What phase? - hotpaw2

4 Answers

4
votes

i think you are saying "phase" when you mean "frequency" in some parts of your question?

anyway, if you are asking about frequency, it works pretty much like time in the "input" data. you start with time series data, where each array element is at a different time. after the fft the "output" is similar, but each element is a different frequency.

they range from the constant offset to the highest frequency possible, in uniform steps, but the actual order may depend on the implementation you are using. so each complex number represents amplitude and phase at one particular frequency - you can work out the frequency from the position in the output array.

if you have N points that cover a time T then the highest frequency is N/(2T) and the values are multiples of 1/T (including 0Hz - a constant offset). for example, 60 samples over 1 minute (N=60 T=60s) gives a top frequency of 0.5Hz. there are no higher frequencies because the data are not sampled well enough to pick them out clearly (a 1 Hz signal, for example, could be at its maximum on each sample and so would appear as a constant signal). this limit is called the nyquist frequency
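a minimal sketch of that 60-samples-in-a-minute example, in plain python. the helper name `sample_sine` and the choice of starting phase are mine, picked so the 1 Hz sine lands on its maximum at every sample:

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, phase=math.pi / 2):
    """Sample sin(2*pi*f*t + phase) at t = n / sample_rate_hz."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase)
        for n in range(n_samples)
    ]

n_samples = 60                               # N
duration_s = 60.0                            # T
sample_rate_hz = n_samples / duration_s      # one sample per second
nyquist_hz = n_samples / (2 * duration_s)    # highest frequency, N/(2T)
bin_spacing_hz = 1 / duration_s              # output values are multiples of 1/T

# A 1 Hz sine sampled once per second hits its maximum on every sample,
# so the sampled data cannot be told apart from a constant.
aliased = sample_sine(1.0, sample_rate_hz, n_samples)
print(nyquist_hz, min(aliased), max(aliased))
```

every value in `aliased` comes out as 1 (up to rounding), which is why nothing above 0.5 Hz can be picked out of these data.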

(the above assumes the output is an array of complex numbers; often it's an array of floats/doubles and you need to piece together the complex numbers from real and imaginary values in different parts of the array - it all gets a bit messy, but the concept is the same as if you were returned an array of complex values).

ps typically when i have to use an fft routine from somewhere i make some data that have a constant offset and two known frequency sine waves, then fft that and look at the results. if you make the amplitudes of each component different then it's usually obvious how things are ordered. you can also check the scale, because sometimes that has/omits a factor of 2pi...
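something like this is what i mean by that test. it is only a sketch: the naive `dft` below stands in for whatever fft routine you are actually checking, and the offset, amplitudes and cycle counts are made-up test values:

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; a stand-in for the FFT routine under test."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]

N = 64
# Constant offset of 3, plus sines with 4 and 10 cycles over the record.
signal = [
    3.0
    + 2.0 * math.sin(2 * math.pi * 4 * n / N)
    + 0.5 * math.sin(2 * math.pi * 10 * n / N)
    for n in range(N)
]

# Dividing by N undoes the scale of this particular (unnormalized) DFT.
magnitudes = [abs(value) / N for value in dft(signal)]

# Keep only the bins that hold significant energy.
peaks = {k: round(m, 6) for k, m in enumerate(magnitudes) if m > 1e-6}
print(peaks)
```

bin 0 holds the offset, bins 4 and 10 each hold half of the corresponding sine's amplitude, and the mirrored bins 60 and 54 hold the other halves. if your routine puts the peaks somewhere else or scales them differently, you have learned its ordering and normalization.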

5
votes

The output of an FFT simply expresses how you can reconstruct the original waveform from the sum of harmonically-related sinusoidal components.

Each output value expresses the amplitude and phase (i.e. offset angle) of the corresponding component. It's important to note that each component is a complex sinusoid (something of the form A * exp(j * (2pi * f * n + phi)), not A * cos(2pi * f * n + phi)).
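To make the reconstruction claim concrete, here is a minimal sketch in plain Python. It assumes an unnormalized forward DFT (so the sum is divided by N on the way back), and the names `dft` and `reconstruct` are illustrative, not from any library. Note that the phase sits inside the complex exponent.

```python
import cmath

def dft(x):
    """Naive O(N^2) forward DFT, unnormalized."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]

def reconstruct(spectrum):
    """Rebuild the samples as a sum of complex sinusoids A * exp(j * (angle + phi))."""
    n_samples = len(spectrum)
    samples = []
    for n in range(n_samples):
        total = 0j
        for k, value in enumerate(spectrum):
            amplitude = abs(value)        # A for component k
            phase = cmath.phase(value)    # phi for component k
            angle = 2 * cmath.pi * k * n / n_samples
            total += amplitude * cmath.exp(1j * (angle + phase))
        samples.append(total / n_samples)
    return samples

signal = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
rebuilt = reconstruct(dft(signal))
print(max(abs(r - s) for r, s in zip(rebuilt, signal)))
```

The printed error is at rounding level, and the imaginary parts of `rebuilt` vanish because the input was real.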

The frequency is implicit in the index of the output sample; if your sample rate is fs (in Hz) and you have a length-N FFT, then the centre frequency corresponding to output sample i is i*fs/N (in Hz).
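As a small illustration of that index-to-frequency rule (the function name is hypothetical, and the 1024-point length is just an example value):

```python
def bin_center_frequency_hz(index, sample_rate_hz, fft_length):
    """Centre frequency of output sample `index`: i * fs / N."""
    return index * sample_rate_hz / fft_length

# One second of audio at 44100 Hz, transformed with N = 44100:
# the bins are 1 Hz apart, not 1/44100 Hz apart.
one_second = [bin_center_frequency_hz(i, 44100, 44100) for i in range(3)]

# The same sample rate with a shorter, 1024-point transform gives wider bins.
short_fft = bin_center_frequency_hz(1, 44100, 1024)

print(one_second, short_fft)
```

So for the case in the question, the first few outputs sit at 0 Hz, 1 Hz and 2 Hz, and the spacing changes as soon as the transform length does.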

4
votes

The phase is related to the shift in time of the periodic signal component in the input samples.

Here's how to see this...

First, recall that Fast FT is exactly the same thing as Discrete FT, only computed in a more efficient way. So, getting back to the basics we have the transform defined as:

$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-j 2\pi n k / N}, \quad 0 \le k \le N-1$$

where:
$x_n$ are the input samples
$X_k$ are the output/transformed samples
N is the number of samples

Now, this complex exponential, $e^{-j 2\pi n k / N}$, geometrically represents points on a circle (of radius 1, centered at (0,0)) in the Re/Im plane. See Euler's formula if you've forgotten this.

For a fixed value of k (representing a specific frequency of interest in the output/transform), the exponential visits only N/gcd(N, k) distinct points on this circle over all n, which is N/k points when k divides N.

Look at the sum in the formula again:

$$\sum_{n=0}^{N-1} x_n \, e^{-j 2\pi n k / N}$$

In this sum you are scaling the vectors from point (0,0) to the aforementioned points on the circle by the input signal $x_n$. You're making these vectors longer or shorter. And then you're adding them up.

If it so happens that $x_n$ contains a periodic signal that has a period of N/k, then all the maximums of that signal will align at one point on the circle and sort of amplify each other. Minimums and all other values of the signal contribute too.

Simply put, what you're doing here is winding your input $x_n$ onto the circle. If there's a periodic component in the signal and its period matches the "circumference" (=number of points on the circle), you get a peak for that period/frequency because of the aligned maximums and minimums. If the period doesn't match the "circumference", the maximums get all over the place and cancel each other out. And this is the essence of the Fourier Transform, this is how and why it works, no magic, no truly complex math, simple winding of a rope onto a reel.
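Here's a rough sketch of the winding in plain Python. The helper name `wind` and the test signal (a cosine with 3 cycles over 32 samples) are my own choices for illustration:

```python
import cmath
import math

def wind(x, k):
    """Wind the samples onto the unit circle for one value of k and add them up."""
    n_samples = len(x)
    return sum(
        x[n] * cmath.exp(-2j * cmath.pi * n * k / n_samples)
        for n in range(n_samples)
    )

N = 32
# A cosine that completes exactly 3 cycles over the N samples (period N/3).
signal = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]

matched = abs(wind(signal, 3))      # winding rate matches the signal's period
mismatched = abs(wind(signal, 5))   # winding rate does not match

print(matched, mismatched)
```

With the matching k the scaled vectors line up and the sum grows to N/2 = 16; with the mismatched k they point all over the place and the sum collapses to rounding noise.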

And the phase that you get in $X_k$ simply indicates the point on the circle where all the maximums aligned. If you shift the periodic signal in $x_n$ by a sample or a few, the alignment point will shift too and the phase will change appropriately.
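You can watch the alignment point move with a few lines of plain Python. This is only a sketch: `bin_phase` and `shifted_cosine` are hypothetical helpers, and N = 32 with 2 cycles per record is an arbitrary test case.

```python
import cmath
import math

N = 32   # number of samples
K = 2    # the cosine completes K cycles over the record

def bin_phase(x, k):
    """Phase (angle) of the single DFT output value for index k."""
    total = sum(
        x[n] * cmath.exp(-2j * cmath.pi * n * k / len(x))
        for n in range(len(x))
    )
    return cmath.phase(total)

def shifted_cosine(m):
    """Cosine with K cycles per record, delayed by m samples."""
    return [math.cos(2 * math.pi * K * (n - m) / N) for n in range(N)]

phase_unshifted = bin_phase(shifted_cosine(0), K)
phase_shifted = bin_phase(shifted_cosine(4), K)

print(phase_unshifted, phase_shifted)
```

Delaying the cosine by 4 samples (a quarter of its 16-sample period) moves the phase from 0 to -pi/2, a quarter turn around the circle.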

That's the geometric explanation.

Now, you can see this same thing as a mathematical property of the Fourier Transform.

If you have your $x_n$ and its transform $X_k = F\{x_n\}$, then the transform of $x_{n-m}$ will be $F\{x_{n-m}\} = F\{x_n\} \, e^{-j 2\pi k m / N} = X_k \, e^{-j 2\pi k m / N}$. This is called the shift theorem/property. You should be able to derive this trivially. This factor of $e^{-j 2\pi k m / N}$ has a magnitude of 1 and only changes the phase when multiplied by $X_k$.
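For completeness, one way the derivation can go. It assumes the shift is circular, i.e. the sample indices are taken modulo N, and the substitution variable p = n - m is introduced here only for the derivation:

```latex
\begin{align*}
F\{x_{n-m}\}_k
  &= \sum_{n=0}^{N-1} x_{n-m} \, e^{-j 2\pi n k / N} \\
  &= \sum_{p=0}^{N-1} x_{p} \, e^{-j 2\pi (p+m) k / N}
     && \text{substitute } p = n - m \text{ (indices mod } N\text{)} \\
  &= e^{-j 2\pi k m / N} \sum_{p=0}^{N-1} x_{p} \, e^{-j 2\pi p k / N} \\
  &= X_k \, e^{-j 2\pi k m / N}
\end{align*}
```

The second line works because both the samples (under the circular assumption) and the exponential are periodic in the index with period N, so summing over any N consecutive indices gives the same result.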

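And the phase is not the frequency: the frequency of a component is fixed by its index k, while the phase is the angle of the complex value at that index.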

Also, the maximum frequency of your sampled signal $x_n$ is half the sample rate (actually, just a tiny bit less than the half, see the Nyquist sampling theorem). That means the FT in your case will never give you anything at or above 22050 Hz because all information at higher frequencies has been lost to the sampling.

And half of the $X_k$ values will give you components with negative frequencies. That's because when k > N/2 the direction in which you move between the points on the circle reverses. So, the maximum frequency is still less than half the sample rate, despite having so many samples in the output/transform.
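A minimal sketch of how the output index maps to a signed frequency under that k > N/2 rule. The function name is hypothetical, and fs = 8 Hz with N = 8 is just a small example; note that some libraries put the k = N/2 bin on the negative side instead.

```python
def signed_bin_frequency_hz(k, sample_rate_hz, fft_length):
    """Frequency of output index k, with k > N/2 mapped to negative frequencies."""
    if k > fft_length / 2:
        k -= fft_length   # walking the circle in the reverse direction
    return k * sample_rate_hz / fft_length

# Example: 8 samples taken at 8 Hz, so the bins are 1 Hz apart.
frequencies = [signed_bin_frequency_hz(k, 8.0, 8) for k in range(8)]
print(frequencies)
```

The indices above N/2 come out as -3, -2 and -1 Hz, so no bin ever exceeds half the sample rate in magnitude.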

1
votes

The frequency of an FFT result isn't captured by a complex number in the result vector. A frequency multiplier is captured by the index of each array element containing the complex number. Then you take the index and multiply by a frequency scale factor, which is related to the sample rate of the time domain samples as well as the inverse of the length of the FFT, to get the center frequency of each FFT bin.

Each frequency sinusoid represented by each FFT result vector element will have its own independent phase, not shared with any other bin or array element.

The frequency will be unknown if you don't know the length of the FFT. So the answer to the last part of your question could be either Unknown or No.