Adding audio buffer [from file] to 'live' audio buffer [recording to file]

Question

What I'm trying to do:

Record up to a specified duration of audio/video, where the resulting output file has predefined background music from an external audio file mixed in, without any further encoding/exporting after recording.

As if you were recording video with the iPhone's Camera app, and every recorded video in 'Camera Roll' already had a background song. No exporting or loading after the recording ends, and the music must not be in a separate audio track.

How I'm trying to achieve this:

By using AVCaptureSession, in the delegate-method where the (CMSampleBufferRef)sample buffers are passed through, I'm pushing them to an AVAssetWriter to write to file. As I don't want multiple audio tracks in my output file, I can't pass the background-music through a separate AVAssetWriterInput, which means I have to add the background-music to each sample buffer from the recording while it's recording to avoid having to merge/export after recording.

The background-music is a specific, pre-defined audio file (format/codec: m4a aac), and will need no time-editing, just adding beneath the entire recording, from start to end. The recording will never be longer than the background-music-file.

Before starting to write to the file, I've also prepared an AVAssetReader that reads the specified audio file.

Some pseudo-code (threading excluded):

-(void)startRecording
{
    /*
        Initialize writer and reader here: [...]
    */
    
    backgroundAudioTrackOutput = [AVAssetReaderTrackOutput 
                            assetReaderTrackOutputWithTrack:
                                backgroundAudioTrack 
                            outputSettings:nil];

    if([backgroundAudioReader canAddOutput:backgroundAudioTrackOutput])
        [backgroundAudioReader addOutput:backgroundAudioTrackOutput];
    else
        NSLog(@"This doesn't happen");

    [backgroundAudioReader startReading];

    /* Some more code */

    recording = YES;
}
- (void)captureOutput:(AVCaptureOutput *)captureOutput 
             didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer 
             fromConnection:(AVCaptureConnection *)connection
{
    if(!recording)
        return;

    if(connection == videoConnection)
        [self writeVideoSampleBuffer:sampleBuffer];
    else if(connection == audioConnection)
        [self writeAudioSampleBuffer:sampleBuffer];
}

The AVCaptureSession is already streaming the camera video and microphone audio, and is just waiting for the BOOL recording to be set to YES. This isn't exactly how I'm doing it, but it's a short, roughly equivalent representation. When the delegate method receives an audio CMSampleBufferRef, I call my own method writeAudioSampleBuffer:sampleBuffer. If this were done normally, without a background track, I'd simply write something like [assetWriterAudioInput appendSampleBuffer:sampleBuffer]; instead of calling my method. In my case, though, I need to overlap two buffers before writing:

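-(void)writeAudioSampleBuffer:(CMSampleBufferRef)recordedSampleBuffer
{
    // copyNextSampleBuffer returns an owned (+1) buffer, or NULL when the
    // reader has no more samples to deliver.
    CMSampleBufferRef backgroundSampleBuffer = 
                     [backgroundAudioTrackOutput copyNextSampleBuffer];

    /* DO MAGIC HERE  */
    CMSampleBufferRef resultSampleBuffer = 
                         [self overlapBuffer:recordedSampleBuffer 
                            withBackgroundBuffer:backgroundSampleBuffer];
    /* END MAGIC HERE */

    [assetWriterAudioInput appendSampleBuffer:resultSampleBuffer];

    // Balance the copy above so one buffer isn't leaked per callback.
    if (backgroundSampleBuffer)
        CFRelease(backgroundSampleBuffer);
}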
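
For the 'magic' step itself, here is a minimal sketch in plain C (which compiles unchanged inside an Objective-C file). It assumes both buffers have already been decoded to interleaved 16-bit signed LPCM with the same sample rate and channel count, and that they hold the same number of samples. The name mix_int16 is hypothetical, not an AVFoundation API. Note that a track output created with outputSettings:nil vends samples in the format stored in the file (AAC here), so the background audio would need to be decoded to LPCM before a routine like this could apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add `count` background samples onto the recorded
   samples, in place. Assumes both arrays hold 16-bit signed LPCM with the
   same sample rate and channel layout. */
void mix_int16(int16_t *recorded, const int16_t *background, size_t count)
{
    assert(recorded != NULL && background != NULL);

    for (size_t i = 0; i < count; i++) {
        /* Sum in 32 bits so the addition itself cannot overflow. */
        int32_t sum = (int32_t)recorded[i] + (int32_t)background[i];

        /* Clamp to the 16-bit range so loud passages saturate
           instead of wrapping around. */
        if (sum > INT16_MAX) sum = INT16_MAX;
        if (sum < INT16_MIN) sum = INT16_MIN;

        recorded[i] = (int16_t)sum;
    }
}
```

Mixing is just sample-wise addition; the clamp only guards against wraparound. If the music should sit quieter beneath the recording, the background samples could be scaled down before being added.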

The problem:

I have to add incremental sample buffers from a local file to the live buffers coming in. The method I have created, overlapBuffer:withBackgroundBuffer:, isn't doing much right now. I know how to extract the AudioBufferList, AudioBuffer, mData etc. from a CMSampleBufferRef, but I'm not sure how to actually add them together. However, I haven't been able to test different ways of doing that, because the real problem happens earlier. Before the magic should happen, I am in possession of two CMSampleBufferRefs, one received from the microphone and one read from the file, and this is the problem:

The sample buffer received from the background-music file is different from the one I receive from the recording session. It seems that the call to [self.backgroundAudioTrackOutput copyNextSampleBuffer]; returns a large number of samples. I realize that this might be obvious to some people, but I've never worked at this level of media technology before. I see now that it was wishful thinking to call copyNextSampleBuffer each time I receive a sampleBuffer from the session, but I don't know when/where to call it instead.

As far as I can tell, the recording-session gives one audio-sample in each sample-buffer, while the file-reader gives multiple samples in each sample-buffer. Can I somehow create a counter to count each received recorded sample/buffers, and then use the first file-sampleBuffer to extract each sample, until the current file-sampleBuffer has no more samples 'to give', and then call [..copyNext..], and do the same to that buffer?

As I'm in full control of both the recording's and the file's codecs, formats etc., I am hoping that such a solution wouldn't ruin the 'alignment'/synchronization of the audio. Given that both sources have the same sampleRate, could this still be a problem?
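
The counter idea described above can be sketched in plain C as a small cursor over the chunk most recently read from the file. This is only an illustration under assumptions: BackgroundCursor, cursor_load and cursor_take are hypothetical names, the samples are assumed to be decoded 16-bit LPCM, and in the real app cursor_load would be fed from the data inside the buffer returned by copyNextSampleBuffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over the current chunk of background samples. */
typedef struct {
    const int16_t *samples;  /* decoded samples of the current file chunk */
    size_t count;            /* number of samples in that chunk */
    size_t position;         /* index of the next unread sample */
} BackgroundCursor;

/* Point the cursor at a freshly read chunk and rewind it. */
void cursor_load(BackgroundCursor *cursor, const int16_t *samples, size_t count)
{
    assert(cursor != NULL);
    cursor->samples = samples;
    cursor->count = count;
    cursor->position = 0;
}

/* Copy up to `wanted` samples into `out` and advance the cursor.
   Returns how many were copied; a result smaller than `wanted` means the
   chunk is used up and the caller should load the next one and call again
   for the remainder. */
size_t cursor_take(BackgroundCursor *cursor, int16_t *out, size_t wanted)
{
    assert(cursor != NULL && out != NULL);

    size_t available = cursor->count - cursor->position;
    size_t n = wanted < available ? wanted : available;
    if (n > 0)
        memcpy(out, cursor->samples + cursor->position, n * sizeof *out);
    cursor->position += n;
    return n;
}
```

For each captured buffer of N samples, the caller would take N samples through the cursor, loading a new chunk whenever cursor_take comes up short. Because every recorded sample is paired with exactly one file sample, the two streams stay aligned by sample count as long as the sample rates and channel counts really do match.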

Note

I'm not even sure if this is possible, but I see no immediate reason why it shouldn't be. It's also worth mentioning that when I use a video file instead of an audio file and continually pull video sample buffers, they line up perfectly.

CRoig · Accepted Answer · 2014-09-25T09:24:45

I am not familiar with AVCaptureOutput, since all my sound/music sessions were built using AudioToolbox instead of AVFoundation. However, I guess you should be able to set the size of the capture buffer. If not, and you still get just one sample per buffer, I would recommend storing the data from each capture callback in an auxiliary buffer. When the auxiliary buffer reaches the same size as the file-reading buffer, call [self overlapBuffer:auxiliarSampleBuffer withBackgroundBuffer:backgroundSampleBuffer];
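
The auxiliary-buffer idea could look roughly like this in plain C. It is a sketch under assumptions rather than tested AVFoundation code: CaptureAccumulator, accumulator_append, accumulator_is_full and BLOCK_SAMPLES are hypothetical names, the block size of 1024 is assumed, and the captured data is assumed to be 16-bit LPCM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SAMPLES 1024  /* assumed size of one file-reading block */

/* Hypothetical auxiliary buffer that collects captured samples. */
typedef struct {
    int16_t samples[BLOCK_SAMPLES];
    size_t filled;  /* number of valid samples currently stored */
} CaptureAccumulator;

/* Append captured samples until the block is full. Returns how many
   samples were consumed; any leftover input must be appended again
   after the full block has been mixed and `filled` reset to 0. */
size_t accumulator_append(CaptureAccumulator *acc, const int16_t *in, size_t count)
{
    assert(acc != NULL && in != NULL);
    assert(acc->filled <= BLOCK_SAMPLES);

    size_t space = BLOCK_SAMPLES - acc->filled;
    size_t n = count < space ? count : space;
    if (n > 0)
        memcpy(acc->samples + acc->filled, in, n * sizeof *in);
    acc->filled += n;
    return n;
}

/* Nonzero once a whole block is ready to be overlapped with the file data. */
int accumulator_is_full(const CaptureAccumulator *acc)
{
    assert(acc != NULL);
    return acc->filled == BLOCK_SAMPLES;
}
```

Each capture callback would append its samples; once accumulator_is_full reports a complete block, that block is overlapped with the file buffer, filled is reset to 0, and any samples that did not fit are appended to start the next block.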

I hope this helps. If not, I can provide an example of how to do this using CoreAudio. Using CoreAudio, I have been able to obtain buffers of 1024 LPCM samples from both microphone capture and file reading, so the overlapping is immediate.