I have a question regarding a sound synthesis app that I'm working on. I am trying to read in an audio file, create randomized 'grains' using granular synthesis techniques, place them into an output buffer, and then play that buffer back to the user using OpenAL. For testing purposes, I am simply writing the output buffer to a file that I can then listen back to.
Judging by my results, I am on the right track, but I'm getting some aliasing issues and playback just doesn't sound quite right: there is usually a rather loud pop in the middle of the output file, and the volume is VERY loud at times.
Here are the steps I have taken to get this far. I'm a little confused about a couple of things, namely the formats I am specifying for my AudioStreamBasicDescriptions.
First I read in an audio file from my mainBundle; it's a mono file in .aiff format:
    ExtAudioFileRef extAudioFile;
    CheckError(ExtAudioFileOpenURL(loopFileURL, &extAudioFile),
               "couldn't open extaudiofile for reading");

    memset(&player->dataFormat, 0, sizeof(player->dataFormat));
    player->dataFormat.mFormatID         = kAudioFormatLinearPCM;
    player->dataFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    player->dataFormat.mSampleRate       = S_RATE;
    player->dataFormat.mChannelsPerFrame = 1;
    player->dataFormat.mFramesPerPacket  = 1;
    player->dataFormat.mBitsPerChannel   = 16;
    player->dataFormat.mBytesPerFrame    = 2;
    player->dataFormat.mBytesPerPacket   = 2;

    // tell extaudiofile about our format
    CheckError(ExtAudioFileSetProperty(extAudioFile,
                                       kExtAudioFileProperty_ClientDataFormat,
                                       sizeof(AudioStreamBasicDescription),
                                       &player->dataFormat),
               "couldn't set client format on extaudiofile");

    SInt64 fileLengthFrames;
    UInt32 propSize = sizeof(fileLengthFrames);
    ExtAudioFileGetProperty(extAudioFile,
                            kExtAudioFileProperty_FileLengthFrames,
                            &propSize,
                            &fileLengthFrames);
    player->bufferSizeBytes = fileLengthFrames * player->dataFormat.mBytesPerFrame;
Next I declare my AudioBufferList and set some more properties:
    AudioBufferList *buffers;
    UInt32 ablSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
    buffers = (AudioBufferList *)malloc(ablSize);

    // bufferSizeBytes is already a byte count, so no extra sizeof(SInt16) factor here
    player->sampleBuffer = (SInt16 *)malloc(player->bufferSizeBytes);

    buffers->mNumberBuffers = 1;
    buffers->mBuffers[0].mNumberChannels = 1;
    buffers->mBuffers[0].mDataByteSize   = player->bufferSizeBytes;
    buffers->mBuffers[0].mData           = player->sampleBuffer;
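I've omitted the read step itself above; it's where the framesRead count used below comes from, and it's essentially the standard ExtAudioFileRead pattern. Roughly:

    // Sketch of the read that fills the AudioBufferList above.
    // ExtAudioFileRead updates framesRead with the number of frames
    // actually read, which may be fewer than requested.
    UInt32 framesRead = (UInt32)fileLengthFrames;
    CheckError(ExtAudioFileRead(extAudioFile, &framesRead, buffers),
               "couldn't read from extaudiofile");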
My understanding is that .mData will contain samples in whatever client data format I specified (in this case, SInt16). Since it comes back as a (void *), I want to get it into float data, which is the obvious choice for audio manipulation. Previously I set up a for loop that just iterated through the buffer and cast each sample to a float (roughly the sketch shown after the call below). That seemed unnecessary, so now I pass my .mData buffer straight into a function I created, which then granularizes the audio:
float *theOutBuffer = [self granularizeWithData:(float *)buffers->mBuffers[0].mData with:framesRead];
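For reference, the kind of per-sample loop I mean is below. This is a minimal sketch rather than my exact code, and it adds the usual scaling into [-1.0, 1.0); monoFloatBuffer is a hypothetical scratch buffer allocated elsewhere:

    // Hypothetical explicit SInt16 -> float conversion for one mono buffer.
    // Assumes monoFloatBuffer has room for framesRead floats.
    SInt16 *intSamples = (SInt16 *)buffers->mBuffers[0].mData;
    for (UInt32 i = 0; i < framesRead; i++) {
        // scale the signed 16-bit sample into the [-1.0, 1.0) range
        monoFloatBuffer[i] = (float)intSamples[i] / 32768.0f;
    }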
Inside granularizeWithData, I dynamically allocate some buffers, create random-sized grains, window each grain with a Hamming window, place them into my out buffer, and return that buffer (which is float data). Everything is cool up to this point; the windowing itself is sketched below.
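The windowing is just the standard Hamming curve. A minimal sketch of how one grain gets windowed in place (applyHammingWindow, grain, and grainLength are illustrative names, not my actual function):

    #include <math.h>

    // Apply a Hamming window in place to one grain of grainLength samples:
    // w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))
    static void applyHammingWindow(float *grain, int grainLength) {
        for (int n = 0; n < grainLength; n++) {
            float w = 0.54f - 0.46f * cosf(2.0f * (float)M_PI * n / (grainLength - 1));
            grain[n] *= w;
        }
    }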
Next I set up the output file's ASBD and the rest of the write path:
    AudioStreamBasicDescription outputFileFormat;
    bzero(&outputFileFormat, sizeof(outputFileFormat));
    outputFileFormat.mFormatID         = kAudioFormatLinearPCM;
    outputFileFormat.mSampleRate       = 44100.0;
    outputFileFormat.mChannelsPerFrame = numChannels;
    outputFileFormat.mBytesPerPacket   = 2 * numChannels;
    outputFileFormat.mFramesPerPacket  = 1;
    outputFileFormat.mBytesPerFrame    = 2 * numChannels;
    outputFileFormat.mBitsPerChannel   = 16;
    // this is the part I'm least sure about: float flag, but 16-bit sizes above
    outputFileFormat.mFormatFlags      = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;

    UInt32 flags = kAudioFileFlags_EraseFile;
    ExtAudioFileRef outputAudioFileRef = NULL;
    NSString *tmpDir = NSTemporaryDirectory();
    NSString *outFilename = @"Decomp.caf";
    NSString *outPath = [tmpDir stringByAppendingPathComponent:outFilename];
    NSURL *outURL = [NSURL fileURLWithPath:outPath];

    AudioBufferList *outBuff;
    UInt32 abSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
    outBuff = (AudioBufferList *)malloc(abSize);
    outBuff->mNumberBuffers = 1;
    outBuff->mBuffers[0].mNumberChannels = 1;
    outBuff->mBuffers[0].mDataByteSize   = framesRead * sizeof(float); // size of the audio data, not of the list struct
    outBuff->mBuffers[0].mData           = theOutBuffer;

    CheckError(ExtAudioFileCreateWithURL((__bridge CFURLRef)outURL,
                                         kAudioFileCAFType,
                                         &outputFileFormat,
                                         NULL,
                                         flags,
                                         &outputAudioFileRef),
               "ErrorCreatingURL_For_EXTAUDIOFILE");
    CheckError(ExtAudioFileSetProperty(outputAudioFileRef,
                                       kExtAudioFileProperty_ClientDataFormat,
                                       sizeof(outputFileFormat),
                                       &outputFileFormat),
               "ErrorSettingProperty_For_EXTAUDIOFILE");
    CheckError(ExtAudioFileWrite(outputAudioFileRef, framesRead, outBuff),
               "ErrorWritingFile");
The file is written correctly, in CAF format. My question is this: am I handling the .mData buffer correctly when I cast the samples to float data, manipulate (granulate) the various window sizes, and then write the result to a file using ExtAudioFileWrite? Is there a more elegant way to do this, such as declaring my ASBD's mFormatFlags as kAudioFormatFlagIsFloat throughout (a sketch of what I mean is below)? My output CAF file has some clicks in it, and when I open it in Logic it looks like there is a lot of aliasing. That would make sense if I'm sending it float data while some kind of conversion I'm unaware of is happening along the way.
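For concreteness, the kind of all-float ASBD I have in mind is sketched below (32-bit native floats, mono); I haven't verified that this is the right move, it's just what I would try:

    // Sketch of an all-float client/output format: packed 32-bit floats, mono.
    AudioStreamBasicDescription floatFormat = {0};
    floatFormat.mFormatID         = kAudioFormatLinearPCM;
    floatFormat.mFormatFlags      = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
    floatFormat.mSampleRate       = 44100.0;
    floatFormat.mChannelsPerFrame = 1;
    floatFormat.mFramesPerPacket  = 1;
    floatFormat.mBitsPerChannel   = 32; // a float is 4 bytes
    floatFormat.mBytesPerFrame    = 4;  // 4 bytes * 1 channel
    floatFormat.mBytesPerPacket   = 4;  // 1 frame per packet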
Thanks in advance for any advice on the matter! I have been an avid reader of pretty much all the source material online, including the Core Audio book, various blogs, tutorials, etc. The ultimate goal of the app is to play the granularized audio to the user in real time over headphones, so writing to a file is just for testing at the moment. Thanks!