I'm analyzing what is essentially a respiratory waveform, constructed in 3 different shapes (the data originates from MRI, so multiple echo times were used; see here if you'd like some quick background). Here are a couple of segments of two of the plotted waveforms for some context:
For each waveform, I'm trying to perform a DFT in order to discover the dominant frequency or frequencies of respiration.
My issue is that when I plot the DFTs that I perform, I get one of two things, dependent on the FFT library that I use. Furthermore, neither of them is representative of what I am expecting. I understand that data doesn't always look the way we want, but I clearly have waveforms present in my data, so I would expect a discrete Fourier transform to produce a frequency peak somewhere reasonable. For reference here, I would expect about 80 to 130 Hz.
My data is stored in a pandas
data frame, with each echo time's data in a separate series. I'm applying the FFT function of choice (see the code below) and receiving different results for each of them.
SciPy (fftpack
)
import pandas as pd
import scipy.fftpack
# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()
# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(scipy.fftpack.fft)
lead_pts_fft_df
NumPy:
import pandas as pd
import numpy as np
# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()
# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(np.fft.fft)
lead_pts_fft_df
The rest of the relevant code:
ECHO_TIMES = [0.080, 0.200, 0.400] # milliseconds
f_s = 1 / (0.006) # 0.006 = time between samples
freq = np.linspace(0, 29556, 29556) * (f_s / 29556) # (29556 = length of data)
# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))
for idx in range(len(ECHO_TIMES)):
axes[idx].plot(freq, lead_pts_fft_df.magnitude[idx].real, # real part
freq, lead_pts_fft_df.magnitude[idx].imag, # imaginary part
axes[idx].legend() # apply legend to each set of axes
# show the plot
plt.show()
Post-DFT (SciPy fftpack
):
Post-DFT (NumPy)
Here is a link to the dataset (on Dropbox) used to create these plots, and here is a link to the Jupyter Notebook.
EDIT:
I used the posted advice and took the magnitude (absolute value) of the data, and also plotted with a logarithmic y-axis. The new results are posted below. It appears that I have some wraparound in my signal. Am I not using the correct frequency scale? The updated code and plots are below.
# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))
for idx in range(len(ECHO_TIMES)):
axes[idx].plot(freq[1::], np.log(np.abs(lead_pts_fft_df.magnitude[idx][1::])),
label=lead_pts_df.index[idx], # apply labels
color=plot_colors[idx]) # colors
axes[idx].legend() # apply legend to each set of axes
# show the plot
plt.show()