6
votes

NOTE: This is not a duplicate; my requirements differ from those in the related questions.

To start with, I want to plot the spectrum of an audio file (.wav) just like Audacity does (similar: How to draw a frequency spectrum from a Fourier transform).

So far I am able to read and write WAV files, but I don't know exactly what values I need to pass to the FFT function. I am using the Exocortex library for FFT in C#. The FFT function requires an array of complex numbers of the right size (512, 1024, ... I presume), an optional integer parameter for the length, and the Fourier direction (forward/backward).

Specific Questions:

  1. The Complex class from the Exocortex library has two values, namely Real and Imaginary. I have the array of samples, so which should be Real and which should be Imaginary?
  2. I have the WAV file, so the length should be assumed variable. How do I pass that to the FFT function? Should I select a size (512/1024/etc.), divide all the samples into blocks of that size, then pass all of them to the FFT?
  3. How do I know which frequencies should be listed on the x-axis?
  4. How do I plot the FFT'ed data? (I want the x-axis to be frequency and the y-axis to be in decibels.)

If you don't get what I mean, try Audacity: import an audio file, then click Analyze > Plot Spectrum. That is what I want to recreate. Please answer in detail because I really want to learn this. I have only a little background in this area and am a newbie in digital signal processing. Also, as much as possible, please don't direct me to other FFT sites, because they don't answer my question specifically.


EDIT:

I've done some reading and found out how to FFT audio data, but only for lengths that are powers of 2. So how do I do the same for an audio file whose length is not a power of 2? According to some, I need to use a "window". I've also done some searching about that and found that a window takes only a portion of the waveform to be processed later. Remember from above that I want the FFT of the whole audio file, not a portion of it. So what should I do now? Please help :(

1
Regarding your note: 1. You are asking too much for a single question. This is more of a math topic, and you'll get better (and more) answers in forums dedicated to it. 2. If your file contains a number of samples that is not a power of two, you need to pad it with zeros to make its size a power of two. This is a limitation of radix-2 FFT implementations. 3. Basically, a window is used to perform the FFT on a portion of your data (if that is what you need). But it goes much deeper: there are different windows with different effects. - Vadim
What you are trying to generate is called a spectrogram. You would be better off asking how to plot a spectrogram than asking about the spectrum. - Ross Bencina

1 Answer

7
votes

The signature is

public static void FFT(float[] data, int length, FourierDirection direction)
  1. You pass an array of complex numbers, represented as (real, imaginary) pairs. Since you only have real numbers (the samples), put your samples in the even locations of the array (data[0], data[2], data[4] and so on) and set the odd locations to 0 (data[1] = data[3] = ... = 0).
  2. The length is the number of samples you want to calculate your FFT on; it should be exactly half the length of the data array. You can FFT your entire WAV or parts of it, depending on what you wish to see. Audacity plots the power spectrum of the selected part of the file; if you wish to do the same, pass the entire WAV or the selected part.
  3. For real input, the FFT only gives you useful frequencies up to half of your sampling rate, so the x-axis runs from 0 to half your sampling rate. For an N-point FFT, output pair k corresponds to the frequency k * sampleRate / N, and the pairs above N/2 mirror the lower ones. The number of values depends on the number of samples you have (more samples give a finer frequency resolution).
  4. Audacity plots the power spectrum. Take each complex number pair in the array you get back and calculate its ABS, defined as sqrt(r^2 + i^2). Each ABS value corresponds to a single frequency. For a decibel axis, plot 20 * log10(ABS).
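Points 1, 3 and 4 can be sketched as plain helper functions that do not depend on Exocortex. This is only a minimal sketch: the class and method names (SpectrumSketch, ToInterleaved, BinFrequency, MagnitudesDb) are made up for illustration, and the dB values are relative to a magnitude of 1, which is an assumption and not necessarily the reference level Audacity uses.

```csharp
using System;

// Hypothetical helpers, not part of Exocortex.
public static class SpectrumSketch
{
    // Point 1: copy real samples into the interleaved (real, imaginary)
    // layout. Even indices hold samples, odd indices stay 0.
    public static float[] ToInterleaved(float[] samples)
    {
        var data = new float[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            data[2 * i] = samples[i];
        }
        return data;
    }

    // Point 3: frequency in Hz of output pair k for an n-point transform.
    public static double BinFrequency(int k, int sampleRate, int n)
    {
        return (double)k * sampleRate / n;
    }

    // Point 4: magnitude in dB of bins 0..n/2, read from interleaved
    // FFT output that holds at least one (real, imaginary) pair.
    public static double[] MagnitudesDb(float[] fftData)
    {
        int n = fftData.Length / 2;
        var db = new double[n / 2 + 1];
        for (int k = 0; k < db.Length; k++)
        {
            double re = fftData[2 * k];
            double im = fftData[2 * k + 1];
            double magnitude = Math.Sqrt(re * re + im * im);
            // A small floor avoids log10(0) = -infinity for empty bins.
            db[k] = 20.0 * Math.Log10(Math.Max(magnitude, 1e-12));
        }
        return db;
    }
}
```

To plot, pair BinFrequency(k, sampleRate, n) on the x-axis with entry k of MagnitudesDb on the y-axis. For example, with a 44100 Hz file and a 1024-point transform, neighbouring bins are about 43 Hz apart.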

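Here's a working example:

// 4 complex values stored as interleaved (real, imaginary) pairs.
float[] data = new float[8];
// Real parts go at even indices; imaginary parts (odd indices) stay 0.
data[0] = 1; data[2] = 1; data[4] = 1; data[6] = 1;
Fourier.FFT(data, data.Length / 2, FourierDirection.Forward);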

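I'm giving it 4 samples, all the same, so I expect to get something only at frequency 0. And indeed, after running it, only data[0] is non-zero. For an unscaled transform it equals the sum of the samples, 4; a library that scales its output will report a different value there.

All the other entries are 0.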
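The expectation that a constant signal only shows up at frequency 0 can be checked independently of the library with a naive O(N^2) DFT on the same interleaved layout. This is a sketch for sanity-checking small inputs, not a replacement for the FFT; the class name NaiveDft is hypothetical, and it assumes the common forward convention (negative exponent, no scaling), which may differ from what Exocortex uses.

```csharp
using System;

// Hypothetical reference implementation for small inputs.
public static class NaiveDft
{
    // Forward DFT of interleaved (real, imaginary) data, no scaling applied.
    public static double[] Forward(double[] data)
    {
        int n = data.Length / 2;
        var result = new double[data.Length];
        for (int k = 0; k < n; k++)
        {
            double sumRe = 0.0, sumIm = 0.0;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                double c = Math.Cos(angle), s = Math.Sin(angle);
                double re = data[2 * t], im = data[2 * t + 1];
                // Complex multiply (re + i*im) * (c + i*s) and accumulate.
                sumRe += re * c - im * s;
                sumIm += re * s + im * c;
            }
            result[2 * k] = sumRe;
            result[2 * k + 1] = sumIm;
        }
        return result;
    }
}
```

Calling NaiveDft.Forward on { 1, 0, 1, 0, 1, 0, 1, 0 } puts the sum of the four samples in the first pair and leaves the other pairs at zero, up to floating-point rounding.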

If I want to use the Complex array overload:

// Each element is one complex sample: real part 1, imaginary part 0.
Complex[] data2 = new Complex[4];
data2[0] = new Complex(1, 0);
data2[1] = new Complex(1, 0);
data2[2] = new Complex(1, 0);
data2[3] = new Complex(1, 0);
Fourier.FFT(data2, data2.Length, FourierDirection.Forward);

Please note that here the second parameter equals the length of the array, since each array member is a complex number. I get the same result as before.

I think I missed the Complex overload before. It seems less error-prone and more natural to use, unless your data already comes in pairs.
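For a file whose sample count is not a power of two, the zero-padding suggested in the comments can be combined with the interleaved layout from point 1. The following is a minimal sketch under that assumption; PaddingSketch, NextPowerOfTwo and PadToInterleaved are hypothetical names, not Exocortex functions.

```csharp
using System;

// Hypothetical helpers, not part of Exocortex.
public static class PaddingSketch
{
    // Smallest power of two that is >= n (returns 1 for n <= 1).
    public static int NextPowerOfTwo(int n)
    {
        if (n > (1 << 30))
        {
            throw new ArgumentOutOfRangeException(nameof(n), "too large for an int power of two");
        }
        int p = 1;
        while (p < n)
        {
            p <<= 1;
        }
        return p;
    }

    // Interleaved (real, imaginary) buffer whose sample count is a power
    // of two: the samples come first, the padding and all imaginary
    // parts stay 0.
    public static float[] PadToInterleaved(float[] samples)
    {
        int n = NextPowerOfTwo(samples.Length);
        var data = new float[checked(2 * n)];
        for (int i = 0; i < samples.Length; i++)
        {
            data[2 * i] = samples[i];
        }
        return data;
    }
}
```

The padded buffer can then be passed as in the first example, with data.Length / 2 as the length. Note that the frequency of each output pair is then based on the padded sample count, not the original one.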