I am trying to display a spectrum analyser for iOS and am stuck after two weeks. I have read pretty much every post about FFT and the Accelerate Frameworks on here and have downloaded the aurioTouch2 example from Apple.
I think I understand the mechanism of FFT (did it in Uni 20 years ago) and am a fairly experienced iOS programmer but I have hit a wall.
I am using AudioUnit to play mp3, m4a, and wav files and have that working beautifully. I have attached a Render Callback to the AUGraph and I can plot Waveforms to the music. The waveform goes with the music nicely.
When I take the data from the Render Callback which is in Float form in the range 0 .. 1 and attempt to pass that through the FFT code (either my own or aurioTouch2's FFTBufferManager.mm) I get something thats not completely wrong, but is not correct either. or instance this is a 440Hz sine wave:
That peak value is -6.1306, followed by -24. -31., -35. and those values towards the end are around -63.
Animated gif for "Black Betty":
The format I receive from the Render callback:
AudioStreamBasicDescription outputFileFormat;
outputFileFormat.mSampleRate = 44100;
outputFileFormat.mFormatID = kAudioFormatLinearPCM;
outputFileFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
outputFileFormat.mBitsPerChannel = 32;
outputFileFormat.mChannelsPerFrame = 2;
outputFileFormat.mFramesPerPacket = 1;
outputFileFormat.mBytesPerFrame = outputFileFormat.mBitsPerChannel / 8;
outputFileFormat.mBytesPerPacket = outputFileFormat.mBytesPerFrame;
In looking at the aurioTouch2 example it looks like they are receiving their data in a signed int format but then running an AudioConverter to convert it to Float. Their format is hard to decipher but is using a macro:
drawFormat.SetAUCanonical(2, false);
drawFormat.mSampleRate = 44100;
XThrowIfError(AudioConverterNew(&thruFormat, &drawFormat, &audioConverter), "couldn't setup AudioConverter");
In their render callback they are copying the data out of the AudioBufferList into mAudioBuffer (Float32*) and passing it to the CalculateFFT method which calls vDSP_ctoz
//Generate a split complex vector from the real data
vDSP_ctoz((COMPLEX *)mAudioBuffer, 2, &mDspSplitComplex, 1, mFFTLength);
I think this is where my problem is. What format does vDSP_ctoz expect? It is cast as a (COMPLEX*) but I cannot find anywhere in the aurioTouch2 code which puts the mAudioBuffer data into the (COMPLEX*) format. So is must be coming from the Render Callback in this format?
typedef struct DSPComplex {
float real;
float imag;
} DSPComplex;
typedef DSPComplex COMPLEX;
If I don't have the format correct at this point (or understand the format) then there is no point in debugging the rest of it.
Any help would be greatly appreciated.
Code from AurioTouch2 that I am using:
Boolean FFTBufferManager::ComputeFFTFloat(Float32 *outFFTData)
{
if (HasNewAudioData())
{
// Added after Hotpaw2 comment.
UInt32 windowSize = mFFTLength;
Float32 *window = (float *) malloc(windowSize * sizeof(float));
memset(window, 0, windowSize * sizeof(float));
vDSP_hann_window(window, windowSize, 0);
vDSP_vmul( mAudioBuffer, 1, window, 1, mAudioBuffer, 1, mFFTLength);
// Added after Hotpaw2 comment.
DSPComplex *audioBufferComplex = new DSPComplex[mFFTLength];
for (int i=0; i < mFFTLength; i++)
{
audioBufferComplex[i].real = mAudioBuffer[i];
audioBufferComplex[i].imag = 0.0f;
}
//Generate a split complex vector from the real data
vDSP_ctoz((COMPLEX *)audioBufferComplex, 2, &mDspSplitComplex, 1, mFFTLength);
//Take the fft and scale appropriately
vDSP_fft_zrip(mSpectrumAnalysis, &mDspSplitComplex, 1, mLog2N, kFFTDirection_Forward);
vDSP_vsmul(mDspSplitComplex.realp, 1, &mFFTNormFactor, mDspSplitComplex.realp, 1, mFFTLength);
vDSP_vsmul(mDspSplitComplex.imagp, 1, &mFFTNormFactor, mDspSplitComplex.imagp, 1, mFFTLength);
//Zero out the nyquist value
mDspSplitComplex.imagp[0] = 0.0;
//Convert the fft data to dB
vDSP_zvmags(&mDspSplitComplex, 1, outFFTData, 1, mFFTLength);
//In order to avoid taking log10 of zero, an adjusting factor is added in to make the minimum value equal -128dB
vDSP_vsadd( outFFTData, 1, &mAdjust0DB, outFFTData, 1, mFFTLength);
Float32 one = 1;
vDSP_vdbcon(outFFTData, 1, &one, outFFTData, 1, mFFTLength, 0);
free( audioBufferComplex);
free( window);
OSAtomicDecrement32Barrier(&mHasAudioData);
OSAtomicIncrement32Barrier(&mNeedsAudioData);
mAudioBufferCurrentIndex = 0;
return true;
}
else if (mNeedsAudioData == 0)
OSAtomicIncrement32Barrier(&mNeedsAudioData);
return false;
}
After reading the answer below I tried adding this to the top of the method:
DSPComplex *audioBufferComplex = new DSPComplex[mFFTLength];
for (int i=0; i < mFFTLength; i++)
{
audioBufferComplex[i].real = mAudioBuffer[i];
audioBufferComplex[i].imag = 0.0f;
}
//Generate a split complex vector from the real data
vDSP_ctoz((COMPLEX *)audioBufferComplex, 2, &mDspSplitComplex, 1, mFFTLength);
And the result I got was this:
I am now rendering the 5 last results, they are the faded ones behind.
After adding hann window:
Now looks a lot better after applying the hann window (Thanks hotpaw2). Not worried about the mirror image.
My main problem now is using a real song it doesn't look like other Spectrum Analysers. Everything is always pushed high on the left no matter what music i push through it. After applying the window it seems to go to the beat a lot better though.
memset
before callingvDSP_hann_window
isn't needed. – Valeriy Van