
I have been working on a homework assignment about audio signal processing. I have read some papers and am confused about a formula (shown as an image in the original post). The formula is used to process 44100 Hz, 16-bit, single-channel audio. The audio has been preprocessed and sliced into frames of width 1024. F(w) is the FFT coefficients of each frame, and w̄ (w with a bar above) is half of the sampling rate, 22050 Hz.

I have searched a lot; the most helpful resource I found is the question "Analyze audio using Fast Fourier Transform", but I still cannot understand it clearly. Using scipy and numpy, I have computed the FFT coefficients of a frame as an array of length 1024. How do I then evaluate the formula? Is it equal to the sum of the values at indices 0 to 512 of the array?

Hope anybody can help me. Thanks in advance.

What have you already tried? We generally expect to see the source code of your attempt. - marko
And is F(w) really the FFT coefficients of each frame? It looks like a function to me. It might be useful to see the equation for it. - marko
@Marko I have searched a lot on Google and Stack Overflow, and I have linked what I think is the most helpful resource. I have read the audio signal, preprocessed it, and sliced it into frames. I have run an FFT on the frames, so what I have now should be the FFT coefficients, but I don't know how to use them to evaluate the formula. I also wish F(w) were given as an equation; if it were, I would not need to ask here. All the papers I have read just say that F(w) is the FFT coefficients of each frame. - zhangyangyu

1 Answer


Assuming you've got a signal x = [x_1, x_2, ..., x_N], you would compute the formula above in Python (with scipy imported) as:

from scipy.fft import fft
E = sum(abs(fft(x))[:len(x) // 2] ** 2) / len(x)  # // keeps the slice index an integer on Python 3

I'm not 100% sure about the normalization factor N = len(x); it depends on the exact implementation of the FFT.
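
To make the bin-to-frequency mapping concrete, here is a minimal sketch. It assumes the formula in the question is the sum of |F(w)|^2 from 0 up to w̄, which the formula image is not available to confirm. The names `frame_energy`, `SAMPLE_RATE`, and `FRAME_SIZE`, and the 1 kHz test sine, are made up for illustration. It uses `numpy.fft.rfft`, which for a real frame of length N returns only the N/2 + 1 bins from 0 Hz to the Nyquist frequency.

```python
import numpy as np

SAMPLE_RATE = 44100  # Hz, as stated in the question
FRAME_SIZE = 1024    # samples per frame, as stated in the question


def frame_energy(frame):
    """Sum of |F(w)|^2 over bins 0..N/2 (0 Hz to Nyquist), divided by N.

    Assumption: this is the quantity the formula in the question describes.
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)  # N//2 + 1 complex bins for real input
    return np.sum(np.abs(spectrum) ** 2) / len(frame)


# Hypothetical test frame: one frame of a 1 kHz sine wave.
t = np.arange(FRAME_SIZE) / SAMPLE_RATE
frame = np.sin(2 * np.pi * 1000.0 * t)

# Centre frequency of each rfft bin, from 0 Hz up to SAMPLE_RATE / 2.
freqs = np.fft.rfftfreq(FRAME_SIZE, d=1.0 / SAMPLE_RATE)

print(len(freqs), freqs[0], freqs[-1])
print(frame_energy(frame))
```

With a 1024-sample frame, indices 0 to 512 of the full FFT array cover 0 Hz to 22050 Hz, and the remaining indices are the mirrored negative frequencies of a real signal. Note that the slice `[:len(x) // 2]` in the one-liner above stops at index 511 and so leaves out the Nyquist bin, while `rfft` includes it. As for the scaling, NumPy's and SciPy's default forward FFT applies no normalization, so dividing by N is one common convention; whether the paper's formula expects it is something to check against the paper.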