4
votes

I have an audio app, and I need to capture mic samples to encode into MP3 with ffmpeg.

First configure the audio:

/**
     * We need to specify the format we want to work with.
     * We use linear PCM because it is uncompressed and we work on raw data.
     * For more information check.
     *
     * We want 16 bits, i.e. 2 bytes (one SInt16) per packet/frame, at 8 kHz.
     */
    // zero-initialize so fields that are not set below (e.g. mReserved) are 0
    AudioStreamBasicDescription audioFormat = {0};
    audioFormat.mSampleRate         = SAMPLE_RATE;
    audioFormat.mFormatID           = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags        = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
    audioFormat.mFramesPerPacket    = 1;
    audioFormat.mChannelsPerFrame   = 1;
    audioFormat.mBitsPerChannel     = sizeof(SInt16)*8; // bits per channel, independent of the channel count
    audioFormat.mBytesPerPacket     = audioFormat.mChannelsPerFrame*sizeof(SInt16);
    audioFormat.mBytesPerFrame      = audioFormat.mChannelsPerFrame*sizeof(SInt16);
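
For packed, interleaved linear PCM the size fields above follow from the channel count and the sample width. A minimal sketch of that arithmetic in plain C; the `PcmFormat` struct and `make_packed_pcm` are hypothetical stand-ins for illustration, not part of Core Audio:

```c
#include <stdint.h>

/* Hypothetical stand-in for the size-related fields of
 * AudioStreamBasicDescription (not the real Core Audio type). */
typedef struct {
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t bytesPerPacket;
} PcmFormat;

/* Derive the sizes for packed, interleaved linear PCM. */
static PcmFormat make_packed_pcm(uint32_t channels, uint32_t bytesPerSample)
{
    PcmFormat f;
    f.channelsPerFrame = channels;
    f.bitsPerChannel   = bytesPerSample * 8;        /* per channel, not per frame */
    f.framesPerPacket  = 1;                         /* uncompressed: one frame per packet */
    f.bytesPerFrame    = channels * bytesPerSample; /* one sample for every channel */
    f.bytesPerPacket   = f.bytesPerFrame * f.framesPerPacket;
    return f;
}
```

For the mono 16-bit case in the question, `make_packed_pcm(1, sizeof(int16_t))` gives 16 bits per channel and 2 bytes per frame and per packet.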

The recording callback is:

static OSStatus recordingCallback(void *inRefCon, 
                                  AudioUnitRenderActionFlags *ioActionFlags, 
                                  const AudioTimeStamp *inTimeStamp, 
                                  UInt32 inBusNumber, 
                                  UInt32 inNumberFrames, 
                                  AudioBufferList *ioData) 
{
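    NSLog(@"Log record: %u", (unsigned int)inBusNumber);
    NSLog(@"Log record: %u", (unsigned int)inNumberFrames);
    // note: this casts the pointer itself, so it logs an address, not a time;
    // the sample time is in inTimeStamp->mSampleTime
    NSLog(@"Log record: %lu", (unsigned long)inTimeStamp);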

    // the data gets rendered here
    AudioBuffer buffer;

    // a variable where we check the status
    OSStatus status;

    /**
     This is the reference to the object who owns the callback.
     */
    AudioProcessor *audioProcessor = (__bridge AudioProcessor*) inRefCon;

    /**
     At this point we define the number of channels, which is mono
     for the iPhone. The number of frames is usually 512 or 1024.
     */
    buffer.mDataByteSize = inNumberFrames * sizeof(SInt16); // sample size
    buffer.mNumberChannels = 1; // one channel

    buffer.mData = malloc( inNumberFrames * sizeof(SInt16) ); // buffer size

    // we put our buffer into a bufferlist array for rendering
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0] = buffer;

    // render input and check for error
    status = AudioUnitRender([audioProcessor audioUnit], ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList);
    [audioProcessor hasError:status:__FILE__:__LINE__];

    // process the bufferlist in the audio processor
    [audioProcessor processBuffer:&bufferList];

    // clean up the buffer
    free(bufferList.mBuffers[0].mData);


    //NSLog(@"RECORD");
    return noErr;
}

With data:

inBusNumber = 1

inNumberFrames = 1024

inTimeStamp = 80444304 // always the same inTimeStamp, which is strange

However, the frame size that I need to encode MP3 is 1152. How can I configure it?

If I buffer, that implies a delay, which I would like to avoid because this is a real-time app. With the current configuration, every buffer ends with trailing garbage samples: 1152 - 1024 = 128 bad samples. All samples are SInt16.
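
To put numbers on the mismatch, the following sketch computes how long each block lasts and how many frames stay queued if callback blocks are accumulated into encoder frames. It assumes the 8 kHz rate mentioned in the code comment (the actual value of `SAMPLE_RATE` is not shown), and the function names are hypothetical:

```c
#include <stdint.h>

/* Duration of `frames` frames at `sampleRate` Hz, in milliseconds. */
static double frames_to_ms(uint32_t frames, double sampleRate)
{
    return 1000.0 * frames / sampleRate;
}

/* Frames still waiting after `callbacks` callbacks, once every
 * complete encoder frame has been handed to the encoder. */
static uint32_t leftover_frames(uint32_t callbacks,
                                uint32_t framesPerCallback,
                                uint32_t encoderFrame)
{
    uint64_t total = (uint64_t)callbacks * framesPerCallback;
    return (uint32_t)(total % encoderFrame);
}
```

Under that assumption a 1024-frame callback covers 128 ms and a 1152-frame encoder frame covers 144 ms. The leftover is always smaller than one encoder frame, and the two sizes realign every 9 callbacks (9 * 1024 = 8 * 1152 = 9216 frames).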

1
This doesn't address your question directly, but you should avoid calling Objective-C or any blocking functions (such as malloc or free) in your render callbacks. - sbooth
I solved this problem; you can see my answer here: stackoverflow.com/a/65053984/7773867 - Ahmed

1 Answer

2
votes

You can set the maximum number of frames per slice an AudioUnit will be asked to render with the property kAudioUnitProperty_MaximumFramesPerSlice. Note that this is an upper bound, not a guarantee of exactly 1152 frames per callback. However, I think the best solution in your case is to buffer the incoming audio in a ring buffer and then signal your encoder when audio is available. Since you're transcoding to MP3, I'm not sure what real-time means in this case.
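
A minimal sketch of the ring-buffer idea in plain C: the callback side appends whatever block it receives, and the encoder side pulls exactly one encoder frame at a time. The names `SampleRing`, `ring_write`, `ring_read_exact` and the capacity are hypothetical, and this version is single-threaded; a real render callback would need a lock-free single-producer/single-consumer variant, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 4096 /* hypothetical size, room for a few callbacks */

typedef struct {
    int16_t data[RING_CAPACITY];
    size_t  head;  /* index of the next sample to read */
    size_t  count; /* number of samples currently stored */
} SampleRing;

static void ring_init(SampleRing *r)
{
    r->head = 0;
    r->count = 0;
}

/* Append up to n samples; returns how many were stored (fewer if full). */
static size_t ring_write(SampleRing *r, const int16_t *src, size_t n)
{
    size_t space = RING_CAPACITY - r->count;
    if (n > space)
        n = space;
    size_t tail = (r->head + r->count) % RING_CAPACITY;
    for (size_t i = 0; i < n; i++) {
        r->data[tail] = src[i];
        tail = (tail + 1) % RING_CAPACITY;
    }
    r->count += n;
    return n;
}

/* Pop exactly n samples into dst; returns 1 on success,
 * or 0 (leaving the ring untouched) if fewer than n are stored. */
static int ring_read_exact(SampleRing *r, int16_t *dst, size_t n)
{
    if (r->count < n)
        return 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = r->data[r->head];
        r->head = (r->head + 1) % RING_CAPACITY;
    }
    r->count -= n;
    return 1;
}
```

With 1024-frame callbacks and a 1152-frame encoder, the first `ring_read_exact(&ring, frame, 1152)` fails after one callback and succeeds after the second, leaving 896 samples queued for the next frame. No padding or garbage samples ever reach the encoder.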