2
votes

I'm analyzing what is essentially a respiratory waveform, constructed in 3 different shapes (the data originates from MRI, so multiple echo times were used; see here if you'd like some quick background). Here are a couple of segments of two of the plotted waveforms for some context:

enter image description here

For each waveform, I'm trying to perform a DFT in order to discover the dominant frequency or frequencies of respiration.

My issue is that when I plot the DFTs that I perform, I get one of two things, dependent on the FFT library that I use. Furthermore, neither of them is representative of what I am expecting. I understand that data doesn't always look the way we want, but I clearly have waveforms present in my data, so I would expect a discrete Fourier transform to produce a frequency peak somewhere reasonable. For reference here, I would expect about 80 to 130 Hz.

My data is stored in a pandas data frame, with each echo time's data in a separate series. I'm applying the FFT function of choice (see the code below) and receiving different results for each of them.

SciPy (fftpack)

import pandas as pd
import scipy.fftpack

# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()

# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(scipy.fftpack.fft)
lead_pts_fft_df

NumPy:

import pandas as pd
import numpy as np 

# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()

# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(np.fft.fft)
lead_pts_fft_df

The rest of the relevant code:

ECHO_TIMES = [0.080, 0.200, 0.400] # milliseconds

f_s = 1 / (0.006) # 0.006 = time between samples
freq = np.linspace(0, 29556, 29556) * (f_s / 29556) # (29556 = length of data)

# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))

for idx in range(len(ECHO_TIMES)):
    axes[idx].plot(freq, lead_pts_fft_df.magnitude[idx].real, # real part
                   freq, lead_pts_fft_df.magnitude[idx].imag, # imaginary part

    axes[idx].legend() # apply legend to each set of axes

# show the plot
plt.show()

Post-DFT (SciPy fftpack):

enter image description here

Post-DFT (NumPy)

enter image description here

Here is a link to the dataset (on Dropbox) used to create these plots, and here is a link to the Jupyter Notebook.

EDIT:

I used the posted advice and took the magnitude (absolute value) of the data, and also plotted with a logarithmic y-axis. The new results are posted below. It appears that I have some wraparound in my signal. Am I not using the correct frequency scale? The updated code and plots are below.

# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))

for idx in range(len(ECHO_TIMES)):
    axes[idx].plot(freq[1::], np.log(np.abs(lead_pts_fft_df.magnitude[idx][1::])),
                   label=lead_pts_df.index[idx], # apply labels
                   color=plot_colors[idx]) # colors
    axes[idx].legend() # apply legend to each set of axes

# show the plot
plt.show()

enter image description here

1
Actually caught that just before you posted. Fixed in the edit. Thank you!questionable_code
What is the mean value of the input series? The plots show values that are around 0.99, but the vertical axis label says "normalized". Are plots representative of the actual data being passed to the FFT functions?Warren Weckesser
With a Fourier transform, you generally not interested in the real and imaginary parts (unless you actually need phase information). So you should consider the magnitude of the complex values. This is the "energy" at a given frequency, and that will help you to determine which frequencies comtribute most to the signal. The other thing is that the signals have a mean value, so the zero frequency, i.e. constant, is significant and dominates the contribution from the other frequencies. Either subtract of the mean of the signal or use a logarithmic y-scale.jdowner
side question ... does that link you refer to mriquestions.com supply sample dataset files similar to the one you plot above ?Scott Stensland
@ScottStensland likely no; my dataset is from a proprietary MR scanner. I can provide it to you, if that is helpfulquestionable_code

1 Answers

1
votes

I've figured out my issues.

Cris Luengo was very helpful with this link, which helped me determine how to correctly scale my frequency bins and plot the DFT appropriately.

ADDITIONALLY: I had forgotten to take into account the offset present in the signal. Not only does it cause issues with the huge peak at 0 Hz in the DFT, but it is also responsible for most of the noise in the transformed signal. I made use of scipy.signal.detrend() to eliminate this and got a very appropriate looking DFT.

# import DFT and signal packages from SciPy
import scipy.fftpack
import scipy.signal

# temporary copy to maintain data structure; original data frame is NOT changed due to ".copy()"
lead_pts_fft_df = lead_pts_df.copy()

# apply a discrete fast Fourier transform to each data series in the data frame AND detrend the signal
lead_pts_fft_df.magnitude = lead_pts_fft_df.magnitude.apply(scipy.signal.detrend)
lead_pts_fft_df.magnitude = lead_pts_fft_df.magnitude.apply(np.fft.fft)
lead_pts_fft_df

Arrange frequency bins accordingly:

num_projections = 29556
N = num_projections
T = (6 * 3) / 1000 # 6*3 b/c of the nature of my signal: 1 pt for each waveform collected every third acquisition
xf = np.linspace(0.0, 1.0 / (2.0 * T), num_projections / 2)

Then plot:

# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))

for idx in range(len(ECHO_TIMES)):
    axes[idx].plot(xf, 2.0/num_projections * np.abs(lead_pts_fft_df.magnitude[idx][:num_projections//2]),
                   label=lead_pts_df.index[idx], # apply labels
                   color=plot_colors[idx]) # colors
    axes[idx].legend() # apply legend to each set of axes

# show the plot
plt.show()

enter image description here