I have an array of uniformly sampled signal values recorded at 10 Hz (meaning two consecutive data points are 100 milliseconds apart). The values are the magnitude of the three axes of a 3D gyroscope, and the array contains 30 data points (3 seconds). I plot the frequency spectrum of this series as follows:
import numpy as np
import matplotlib.pyplot as pl
sample_rate = 10
x = np.array([318.45,302.78,316.47,334.14,333.41,326.15,320.07,318.68,314.12,308.64,300.15,304.33,318.42,322.72,329.56,339.18,338.03,343.27,351.44,353.23,352.35,352.88,353.43,352.14,351.28,352.82,353.36,353.35,353.19,353.82])
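x = x - np.mean(x)  # remove the mean so the DC bin does not dominate the plot
p = np.abs(np.fft.rfft(x))  # magnitude of the one-sided spectrum
f = np.fft.rfftfreq(len(x), d=1/sample_rate)  # bin frequencies in Hz, from 0 to sample_rate/2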
pl.plot(f, p)
pl.show()
Can someone tell me whether I plotted this correctly? I am planning to calculate the following features from the above signal:
- DC Component
- Spectral Energy
- Information Entropy
- Dominant frequency components
- Principal frequency
- Magnitude of the first five components of FFT analysis
Can someone help me extend the above code to calculate those features?
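For reference, here is a minimal sketch of how these features might be computed. The definitions are assumptions, since papers differ: the DC component is taken as the mean of the raw signal, spectral energy as the sum of squared FFT magnitudes divided by the number of samples, information entropy as the Shannon entropy of the normalized power spectrum, and the dominant (principal) frequency as the frequency of the largest non-DC peak. The function name `spectral_features` is hypothetical.

```python
import numpy as np

def spectral_features(x, sample_rate):
    """Frequency-domain features of a 1-D signal (all definitions are assumptions)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dc = x.mean()                                  # DC component: mean of the raw signal
    mag = np.abs(np.fft.rfft(x - dc))              # one-sided magnitudes, mean removed
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    power = mag ** 2
    energy = power.sum() / n                       # assumed: sum of squares divided by n
    p = power / power.sum()                        # power spectrum as a probability distribution
    p = p[p > 0]                                   # drop empty bins so log2 is defined
    entropy = -np.sum(p * np.log2(p))              # Shannon entropy in bits
    peak = np.argmax(mag[1:]) + 1                  # largest peak, skipping bin 0
    return {
        "dc": dc,
        "spectral_energy": energy,
        "entropy": entropy,
        "dominant_freq": freqs[peak],
        "first_five": mag[:5],                     # bins 0..4; bin 0 is ~0 after mean removal
    }

x = np.array([318.45,302.78,316.47,334.14,333.41,326.15,320.07,318.68,314.12,308.64,
              300.15,304.33,318.42,322.72,329.56,339.18,338.03,343.27,351.44,353.23,
              352.35,352.88,353.43,352.14,351.28,352.82,353.36,353.35,353.19,353.82])
feats = spectral_features(x, sample_rate=10)
for name, value in feats.items():
    print(name, value)
```

A perfectly constant window has zero power after mean removal, so the entropy would be undefined in that case; a real pipeline would need to handle it explicitly.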
----@RoadRunner66: Please see my questions below as I could not post a long reply to you----
Thank you for your answer and your code.
Regarding your question, the data is from the gyroscope, which measures the Euler angles.
So is (sum x[i]**2 : 3357757.0) the spectral energy? If so, do I need to normalize it by dividing this number by n (or multiplying by n, as you did)? The two papers below differ in their definitions.
The first paper (first link below) states that "The second frequency-domain feature set was chosen to be spectral energy, which is defined to be the sum of the squared FFT coefficients"
The second paper (second link) defines it differently: "Spectral energy: the squared sum of spectral coefficients divided by the number of samples in a window"
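If the second definition is read as the sum of squared coefficients divided by n (an interpretation, not something the paper confirms), then the two definitions differ only by the constant factor n. Parseval's theorem makes this concrete: for NumPy's unnormalized FFT, the sum of the squared samples equals the sum of the squared FFT magnitudes divided by n. A small check, using the first eight samples of the signal above:

```python
import numpy as np

x = np.array([318.45, 302.78, 316.47, 334.14, 333.41, 326.15, 320.07, 318.68])
n = len(x)

time_energy = np.sum(x ** 2)              # sum x[i]**2 in the time domain
X = np.fft.fft(x)                         # full, unnormalized spectrum
sum_sq_coeffs = np.sum(np.abs(X) ** 2)    # "sum of the squared FFT coefficients"
per_sample = sum_sq_coeffs / n            # the same sum divided by the number of samples

print(np.isclose(per_sample, time_energy))         # Parseval: these two agree
print(np.isclose(sum_sq_coeffs, n * time_energy))  # the undivided sum is n times larger
```

For a fixed window length the factor n is the same for every window, so either convention should carry the same information as a feature; it only matters that one convention is used consistently.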
And what about the principal frequency: does that term mean the same as dominant frequency? I guess the principal frequency refers to the single frequency with the highest spectral peak?
I printed the frequencies and the corresponding magnitudes in two rows, like this:
I think you printed the magnitudes of the first five, like the ones highlighted in yellow below. I am not sure about the definition of "first 5 components".
If we use the first five consecutive components, as you pointed out, does it make sense to include ones such as those at frequency 0 or 0.666 Hz and feed them into my prediction model (as explained below), given that their magnitudes are very low compared to the others? If the resulting spectrum is clean, with dominant frequencies at, say, 1 Hz and 3 Hz, then the magnitudes at 0.5 Hz or 1.5 Hz may be close to zero.
Could the term "Magnitude of the first five components of FFT analysis" actually mean "magnitude of the first five dominant components", as I highlighted in blue? Does this term refer to five values or to just one value (the square root of the sum of the five squares)? If it refers to five values (which is very likely), then I think the five most dominant components by magnitude would be a better choice for comparing the difference between two signals?
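To make the two readings concrete, here is a small sketch that extracts both candidates from the signal above: the first five consecutive bins (lowest frequencies) and the five bins with the largest magnitude. Which one the papers intend is not settled here; the variable names are illustrative.

```python
import numpy as np

sample_rate = 10
x = np.array([318.45,302.78,316.47,334.14,333.41,326.15,320.07,318.68,314.12,308.64,
              300.15,304.33,318.42,322.72,329.56,339.18,338.03,343.27,351.44,353.23,
              352.35,352.88,353.43,352.14,351.28,352.82,353.36,353.35,353.19,353.82])

mag = np.abs(np.fft.rfft(x - x.mean()))                 # one-sided magnitudes, mean removed
freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)    # bin frequencies in Hz

first_idx = np.arange(5)                 # reading 1: the five lowest-frequency bins
top_idx = np.argsort(mag)[::-1][:5]      # reading 2: the five largest magnitudes, descending

for label, idx in (("first five", first_idx), ("top five", top_idx)):
    print(label, np.round(freqs[idx], 3), np.round(mag[idx], 2))
```

One practical difference: the first reading always places the same frequency in the same feature slot, while the second reading can put a different frequency in each slot from window to window, so the frequencies themselves would probably have to be kept alongside the magnitudes.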
Btw, the second paper also wrote "First 5-FFT coefficients: the first 5 of the fast-Fourier transform coefficients are taken since they capture the main frequency components, and the use of additional coefficients did not improve the accuracies"
To be frank, I am working on the problem of recognizing cow activities. My strategy is to segment the sensor data into time windows (3, 5, 7, ... seconds), extract features from each window, and then feed them to a machine learning model.
(My data comes from a 3D gyroscope and a 3D accelerometer attached to the neck of the cow; the sensor data is sampled at 10 Hz.)
I want to combine two types of features: time-domain features and frequency-domain features.
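The windowing strategy could be sketched roughly as follows. This assumes non-overlapping windows and drops a trailing partial window; the helper names `split_windows` and `window_features`, the particular features chosen, and the synthetic input signal are all illustrative assumptions rather than anything taken from the papers.

```python
import numpy as np

def split_windows(signal, sample_rate, window_seconds):
    """Split a 1-D signal into non-overlapping windows, dropping a trailing partial window."""
    signal = np.asarray(signal, dtype=float)
    size = int(round(sample_rate * window_seconds))
    n_windows = len(signal) // size
    return signal[: n_windows * size].reshape(n_windows, size)

def window_features(window, sample_rate):
    """Two time-domain and two frequency-domain features for one window."""
    mag = np.abs(np.fft.rfft(window - window.mean()))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / sample_rate)
    return np.array([
        window.mean(),                      # time domain: mean
        window.std(),                       # time domain: standard deviation
        freqs[np.argmax(mag)],              # frequency domain: frequency of the largest peak
        np.sum(mag ** 2) / len(window),     # frequency domain: spectral energy (assumed definition)
    ])

rng = np.random.default_rng(0)
signal = rng.normal(size=95)                # synthetic stand-in for one sensor channel
windows = split_windows(signal, sample_rate=10, window_seconds=3)
features = np.array([window_features(w, 10) for w in windows])
print(windows.shape, features.shape)        # (3, 30) (3, 4)
```

Each row of `features` would then be one training example for the model, with one such set of columns per sensor channel.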
I read some papers and found the above set of frequency-domain features, which includes the term "Magnitude of the first five components of FFT analysis", in this paper: https://ieeexplore.ieee.org/abstract/document/4663615 and in this one: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4634510/
The second paper refers to it as "First 5-FFT coefficients: the first 5 of the fast-Fourier transform coefficients are taken since they capture the main frequency components, and the use of additional coefficients did not improve the accuracies."
Thank you so much for reading and for your answer!