I'm trying to make a simple music visualization application. I understand I need to take my audio samples and perform a Fast Fourier Transform. I'm trying to find out how to determine what the scale of the magnitude is, so I can normalize it to be between 0.0 and 1.0 for plotting purposes.

My application is set up to allow reading audio in 16-bit and 24-bit format, so I scale all incoming audio samples to [-1.0,1.0), then I use a real-to-complex 1-dimensional transform for N samples.

From there, I think I need to take the absolute value of each bin (using the cabs function) between 0 and N/2, but I'm not sure what these numbers really represent or what I'm supposed to do with them.

I've figured out how to calculate the frequency of each bin. I'm not interested in finding the actual magnitude or amplitude in decibels; I really just want to get a value between 0.0 and 1.0.

Most explanations for fftw involve a lot of math that is honestly way above my head.

You have scaled the sample to the range [-1.0, 1.0] so to now range it as [0.0, 1.0] you add 1.0 and halve it. - Weather Vane
The question title sounds misleading, the question suggests you want to scale FFT output. Store the cabs() values in an array, find the maximum value and divide all array elements by that maximum value. - Hans Passant
If you know the bin frequencies, you can generate sample input that is a sine wave at one bin frequency, or a mixture of 2 sine waves at two different bin frequencies. Then you can play with the amplitudes, and see what comes out of the FFT. FFTs are normally linear, i.e. changing the input amplitude changes the output magnitude by the same proportion. - user3386109
@WeatherVane: The [-1, 1] scale is prior to the transform. DFTs typically scale the data, likely by N. Further, the [-1, 1] scale is likely a map from the maximum of the sensor data to 1. The maximum of the actual audio would typically be less. Further, the output of the transform is a sequence of complex numbers, so the data they need to use for the plot is not in any real interval. - Eric Postpischil
@EricPostpischil I see now that OP says "trying to find out how to determine what the scale of the magnitude is" so yes, it's not the actual range. - Weather Vane

1 Answer

[Per comments, OP seeks to know the maximum possible magnitude of any output bin given inputs in [−1, 1]. This answer gives a way to determine that.]

DFT routines vary in how they handle scaling. Some normalize their output to keep the scale the same, and some let the arithmetic operations grow the scale for better performance or implementation convenience. So the possible scale of the output is not determined solely by mathematics; it depends on the routine used. The documentation of the routine ought to state what scaling it uses.

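In the absence of clear documentation, you can determine the scale empirically: write a sine wave with amplitude one to the input (with a frequency matching one of the output bins), perform the transform, and examine the output to see which bin has the largest magnitude (it should be the one whose frequency you used, of course). For a real sine wave, an unnormalized transform (FFTW does not normalize) will give about N/2, where N is the number of inputs, because the other half of the energy lands in the mirrored negative-frequency bin; a transform that divides by N will give about 1/2. The absolute upper bound for inputs in [−1, 1] is twice that (N or 1). It is reached only by a constant input of 1 in bin 0 or, for even N, by an alternating +1/−1 input in bin N/2. Expect some slop in all of these numbers due to floating-point rounding effects.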

(When plotting, be sure to allow a little leeway for floating-point rounding effects—the actual numbers could be slightly greater than the maximum, so avoid overflowing or clipping where you do not want that.)
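One way to build in that leeway is to clamp after dividing by the full-scale value. The helper below is a minimal sketch; `normalize_magnitude` and its `full_scale` parameter are hypothetical names, with `full_scale` being whatever maximum you determined for your transform.

```c
/* Maps a bin magnitude (e.g. the result of cabs()) to [0.0, 1.0].
   full_scale is the magnitude that should correspond to 1.0 on the plot.
   Values that overshoot slightly because of rounding are clamped. */
double normalize_magnitude(double magnitude, double full_scale)
{
    if (full_scale <= 0.0)
        return 0.0;            /* no usable scale: draw nothing */

    double v = magnitude / full_scale;
    if (v < 0.0)
        v = 0.0;
    if (v > 1.0)
        v = 1.0;               /* absorbs rounding overshoot */
    return v;
}
```

For example, `normalize_magnitude(16.0, 32.0)` gives 0.5, and a magnitude a hair above 32.0 is drawn as 1.0 rather than running off the plot.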