Fast Fourier Transform in R. What am I doing wrong?

Question

I am not an expert in Fourier analysis and don't quite understand what R's fft() function does. Even after reading around a lot I couldn't figure it out, so I built an example.

require(ggplot2)

freq <- 200  #sample frequency in Hz 
duration <- 3 # length of signal in seconds

#arbitrary sine wave 
x <- seq(-4*pi,4*pi, length.out = freq*duration)
y <- sin(0.25*x) + sin(0.5*x) + sin(x)

which looks like:

[image: plot of the signal y]

fourier <- fft(y)

#frequency "amounts" and associated frequencies

amo <- Mod(fft(y))

freqvec <- 1:length(amo)

I ASSUME that fft expects a vector recorded over a timespan of 1 second, so I divide by the timespan

freqvec <- freqvec/duration 

#and put this into a data.frame

df <- data.frame(freq = freqvec, ammount = amo)

Now I PRESUMABLY can (or have to) omit the second half of the data.frame, since the frequency "amounts" are only meaningful up to half the sampling rate due to Nyquist.

df <- df[(1:as.integer(0.5*freq*duration)),]

For plotting I discretize a bit

df.disc <- data.frame(freq = 1:100)
cum.amo <- numeric(100)
for (i in 1:100){
  cum.amo[i] <- sum(df$ammount[c(3*i-2,3*i-1,3*i)])
}
df.disc$ammount <- cum.amo

The plot function for the first 20 frequencies:

df.disc$freq <- as.factor(df.disc$freq)

ggplot(df.disc[1:20,], aes(x=freq, y=ammount)) + geom_bar(stat = "identity")

The result:

[image: bar plot of the first 20 discretized frequencies]

Is this really a correct spectrogram of the above function? Are my two assumptions correct? Where is my mistake? If there is none, what does this plot tell me?

EDIT: Here is a picture without discretization:

[image: plot without discretization]

THANKS to all of you,

Micha.

Since your original sine superposition consists of three components with distinct frequencies and equal amplitude, I would expect the spectrogram to consist of three bars of equal amplitude. So I must agree that the outcome seems questionable. - paulroho
This may well make my discretization questionable... very much so. I have added a picture without it. - Mika Prouk
Why are you making assumptions about the time-span? Fourier transforms only act on the data provided to them. I also suspect you are not properly applying Nyquist limits, nor have you taken into account the "folding point" of the output of a Fourier transform. Rather than asking about what R functions are doing, perhaps you should start by reading some detailed discussions of what the Real and Imaginary parts of an FFT represent (and why there are peaks at positive and negative frequencies, for example). - Carl Witthoft
As you suspect, the FFT part is a small part of a bigger problem I want to solve, so I thought I could avoid diving too deep into the mathematical details of all the subproblems. In this case, however, there's probably no escape. Still... if I apply fft() to a vector, I get as many complex points as my vector had points, and there must be a way to get the time component back. So my question: fft() assumes the vector to be 1 second "long", no? - Mika Prouk

Mika Prouk · Accepted Answer · 2014-12-07T22:26:29

Okay, okay. My mistake was a rather basic one, so the solution is quite trivial. I wrote freq = 200 and duration = 3, but x actually runs from -4*pi to 4*pi, a span of 8*pi. With 600 samples that gives a "real" sample frequency of 1/((8*pi)/600) = 23.87324, which does not equal 200. Replacing the respective lines in the example code by

freq <- 200  #sample frequency in Hz
duration <- 6 # length of signal in seconds
x <- seq(0,duration, length.out = freq*duration) 
y <- sin(4*pi*x) + sin(6*pi*x) + sin(8*pi*x)

(with a more illustrative function) yields the correct frequencies as demonstrated by the following plot (restricted to the important part of the frequency domain):

[image: spectrum of the corrected signal, restricted to the low frequencies]
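
For anyone who wants to check the numbers rather than read them off a plot, here is a minimal sketch of how the fft() output can be mapped to frequencies in Hz. It is not the code behind the plot above. The names fs, n, t, amp, freq_hz, keep, spectrum and peaks are my own for illustration, and I sample at exact multiples of 1/fs (an assumption that differs slightly from the seq(0, duration, length.out = ...) call above, whose step is duration/(n - 1)) so that each tone lands exactly on one bin.

```r
# Sketch: map fft() bins to frequencies in Hz for the corrected signal.
fs       <- 200                # sampling rate in Hz (as in the answer)
duration <- 6                  # signal length in seconds (as in the answer)
n        <- fs * duration      # number of samples

# Assumption: samples at exact multiples of 1/fs, so 2, 3 and 4 Hz hit exact bins.
t <- (0:(n - 1)) / fs
y <- sin(4 * pi * t) + sin(6 * pi * t) + sin(8 * pi * t)   # 2, 3 and 4 Hz

# fft() knows nothing about time: bin k (0-based) corresponds to k * fs / n Hz.
amp     <- Mod(fft(y)) / (n / 2)   # scaled so a unit-amplitude sine gives about 1
freq_hz <- (0:(n - 1)) * fs / n

# Keep only the bins up to the Nyquist frequency fs / 2; the rest mirror them.
keep     <- freq_hz <= fs / 2
spectrum <- data.frame(freq = freq_hz[keep], amplitude = amp[keep])

# The three strongest bins, sorted by frequency.
peaks <- spectrum[order(-spectrum$amplitude)[1:3], ]
peaks <- peaks[order(peaks$freq), ]
print(peaks)
```

The frequency resolution here is fs / n = 1/6 Hz, so the vector is not assumed to be 1 second long; the time scale enters only through the sampling rate used to label the bins.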
