1024 pt fft on a large set of data points

Question

I have a signal that may not be periodic. We take about 15secs worth of samples (@ 10kHz sampling rate) and we need to do the FFT on that signal to get the frequency content.

The problem is that we are implementing this FFT on an embedded system (DSP) which provides a library FFT of max. 1024 length. That is, it takes in 1024 data points as input and provides a 1024 point output.

What is the correct way of obtaining an FFT on the full 150000 point input?

dpwe dpwe · Accepted Answer · 2014-06-08T19:27:50

You could run the FFT on each 1024 point block and average them to get an average power spectrum on the lower-resolution 1024-point frequency axis (512 samples from 0 to the Nyquist frequency, fs/2, so about 10 Hz resolution for your 10 kHz sampling). You should average the magnitudes of the component FFTs (i.e., sqrt(re^2+im^2)), otherwise the average will be sensitive to the drifting phase within each subwindow, which will depend on the precise frequency of the sinusoi.

If you think the periodic component may be at a low frequency, such that it will show up in a 15 sec sample but not complete any cycles in a 1024/10k ~ 100ms sample (i.e., below 10 Hz or so), you could downsample your input. You could try something as crude as averaging every 100 points to get a somewhat-distorted signal at 100 Hz sampling rate, then pack 10.24 sec worth into your 1024 pt sequence to pass to the FFT.

You could combine these two approaches by using a smaller downsampling factor and then do the magnitude-averaging of successive windows.

I'm confused why the system provides an FFT only up to 1024 points - is there something about the memory that makes it harder to access larger blocks?

Calculating a 128k point FFT using a 1k FFT as a subroutine is possible, but you'd end up recoding a lot of the FFT yourself. Maybe you should forget about the system library and use some other FFT implementation, without the length limitation, that will compile on your target. It may not incorporate all the optimizations of the system-provided one, but you're likely to lose a lot of that advantage when you embed it within the custom code needed to use the partial outputs of the multiple shorter FFTs to produce the long FFT.

Probably the quickest way to do the hybrid FFT (1024 points using the library, then added code to combine them into a 128k point FFT) would be to take an existing full FFT routine (a radix-2, decimation-in-time (DIT) routine for instance), but then modify it to use the system library for what would have been the first 10 stages, which amount to calculating 128 individual 1024-point FFTs on different subsets of the original signal (not, unfortunately, successive windows, but the partial-bit-reversed subsets), then let the remaining 7 stages of butterflies operate on those partial outputs. You'd want to get a pretty solid understanding of how the DIT FFT works to implement this.

1024 pt fft on a large set of data points

1 Answers