I write this additional answer to explain the origins of the diffusion of the spikes when using FFT and especially discuss the scipy.fftpack tutorial with which I disagree at some point.
In this example, the recording time tmax=N*T=0.75
. The signal is sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)
. The frequency signal should contain two spikes at frequencies 50
and 80
with amplitudes 1
and 0.5
. However, if the analysed signal does not have a integer number of periods diffusion can appear due to the truncation of the signal:
- Pike 1:
50*tmax=37.5
=> frequency 50
is not a multiple of 1/tmax
=> Presence of diffusion due to signal truncation at this frequency.
- Pike 2:
80*tmax=60
=> frequency 80
is a multiple of 1/tmax
=> No diffusion due to signal truncation at this frequency.
Here is a code that analyses the same signal as in the tutorial (sin(50*2*pi*x) + 0.5*sin(80*2*pi*x)
), but with the slight differences:
- The original scipy.fftpack example.
- The original scipy.fftpack example with an integer number of signal periods (
tmax=1.0
instead of 0.75
to avoid truncation diffusion).
- The original scipy.fftpack example with an integer number of signal periods and where the dates and frequencies are taken from the FFT theory.
The code:
import numpy as np
import matplotlib.pyplot as plt
import scipy.fftpack
# 1. Linspace
N = 600
# Sample spacing
tmax = 3/4
T = tmax / N # =1.0 / 800.0
x1 = np.linspace(0.0, N*T, N)
y1 = np.sin(50.0 * 2.0*np.pi*x1) + 0.5*np.sin(80.0 * 2.0*np.pi*x1)
yf1 = scipy.fftpack.fft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*T), N//2)
# 2. Integer number of periods
tmax = 1
T = tmax / N # Sample spacing
x2 = np.linspace(0.0, N*T, N)
y2 = np.sin(50.0 * 2.0*np.pi*x2) + 0.5*np.sin(80.0 * 2.0*np.pi*x2)
yf2 = scipy.fftpack.fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*T), N//2)
# 3. Correct positioning of dates relatively to FFT theory ('arange' instead of 'linspace')
tmax = 1
T = tmax / N # Sample spacing
x3 = T * np.arange(N)
y3 = np.sin(50.0 * 2.0*np.pi*x3) + 0.5*np.sin(80.0 * 2.0*np.pi*x3)
yf3 = scipy.fftpack.fft(y3)
xf3 = 1/(N*T) * np.arange(N)[:N//2]
fig, ax = plt.subplots()
# Plotting only the left part of the spectrum to not show aliasing
ax.plot(xf1, 2.0/N * np.abs(yf1[:N//2]), label='fftpack tutorial')
ax.plot(xf2, 2.0/N * np.abs(yf2[:N//2]), label='Integer number of periods')
ax.plot(xf3, 2.0/N * np.abs(yf3[:N//2]), label='Correct positioning of dates')
plt.legend()
plt.grid()
plt.show()
Output:
As it can be here, even with using an integer number of periods some diffusion still remains. This behaviour is due to a bad positioning of dates and frequencies in the scipy.fftpack tutorial. Hence, in the theory of discrete Fourier transforms:
- the signal should be evaluated at dates
t=0,T,...,(N-1)*T
where T is the sampling period and the total duration of the signal is tmax=N*T
. Note that we stop at tmax-T
.
- the associated frequencies are
f=0,df,...,(N-1)*df
where df=1/tmax=1/(N*T)
is the sampling frequency. All harmonics of the signal should be multiple of the sampling frequency to avoid diffusion.
In the example above, you can see that the use of arange
instead of linspace
enables to avoid additional diffusion in the frequency spectrum. Moreover, using the linspace
version also leads to an offset of the spikes that are located at slightly higher frequencies than what they should be as it can be seen in the first picture where the spikes are a little bit at the right of the frequencies 50
and 80
.
I'll just conclude that the example of usage should be replace by the following code (which is less misleading in my opinion):
import numpy as np
from scipy.fftpack import fft
# Number of sample points
N = 600
T = 1.0 / 800.0
x = T*np.arange(N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y)
xf = 1/(N*T)*np.arange(N//2)
import matplotlib.pyplot as plt
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid()
plt.show()
Output (the second spike is not diffused anymore):
I think this answer still bring some additional explanations on how to apply correctly discrete Fourier transform. Obviously, my answer is too long and there is always additional things to say (ewerlopes talked briefly about aliasing for instance and a lot can be said about windowing), so I'll stop.
I think that it is very important to understand deeply the principles of discrete Fourier transform when applying it because we all know so much people adding factors here and there when applying it in order to obtain what they want.