3
votes

I'm attempting to capture video on an iPhone 5 for realtime upload and HLS streaming. I'm at the stage where I'm generating the video on the device (not yet uploading to the server). As these links on SO suggest, I've hacked together some code that swaps out AssetWriters every five seconds.

Right now, during development, I'm just saving the files to the device locally and pulling them out via the Xcode Organizer. I then run Apple's mediafilesegmenter to convert them to MPEG-2 TS (they're already under 10 seconds, so there's no actual segmenting happening - I assume they're just being remuxed to TS). I build the m3u8 by editing together the various index files created during this process (also manually at the moment).
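For reference, the hand-assembled playlist ends up looking roughly like this (segment names and durations are illustrative):

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:5
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:5.0,
segment0.ts
#EXTINF:5.0,
segment1.ts
#EXTINF:5.0,
segment2.ts
#EXT-X-ENDLIST
```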

When I put the assets on a server for testing, they mostly stream correctly, but I can tell when there's a segment switch because the audio briefly drops out (possibly the video too, but I can't tell for sure - it looks OK). This obviously doesn't happen for typical HLS streams segmented from a single input file. I'm at a loss as to what's causing this.

You can open my HLS stream on your iPhone here (you can hear the audio drop after 5 seconds, and again around 10 seconds).

Could something in my creation process (either on the device or in post-processing) be causing the brief audio drops? I don't think I'm dropping any sample buffers during the AssetWriter switch-outs (see code).

- (void)writeSampleBuffer:(CMSampleBufferRef)sampleBuffer ofType:(NSString *)mediaType
{
    if (!self.isStarted) {
        return;
    }

    @synchronized(self) {

        if (mediaType == AVMediaTypeVideo && !assetWriterVideoIn) {
            videoFormat = CMSampleBufferGetFormatDescription(sampleBuffer);
            CFRetain(videoFormat);
            assetWriterVideoIn = [self addAssetWriterVideoInput:assetWriter withFormatDesc:videoFormat];
            [tracks addObject:AVMediaTypeVideo];
            return;
        }

        if (mediaType == AVMediaTypeAudio && !assetWriterAudioIn) {
            audioFormat = CMSampleBufferGetFormatDescription(sampleBuffer);
            CFRetain(audioFormat);
            assetWriterAudioIn = [self addAssetWriterAudioInput:assetWriter withFormatDesc:audioFormat];
            [tracks addObject:AVMediaTypeAudio];
            return;
        }

        if (assetWriterAudioIn && assetWriterVideoIn) {
            recording = YES;
            if (assetWriter.status == AVAssetWriterStatusUnknown) {
                if ([assetWriter startWriting]) {
                    [assetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
                    if (segmentationTimer) {
                        [self setupQueuedAssetWriter];
                        [self startSegmentationTimer];
                    }
                } else {
                    [self showError:[assetWriter error]];
                }
            }

            if (assetWriter.status == AVAssetWriterStatusWriting) {
                if (mediaType == AVMediaTypeVideo) {
                    if (assetWriterVideoIn.readyForMoreMediaData) {
                        if (![assetWriterVideoIn appendSampleBuffer:sampleBuffer]) {
                            [self showError:[assetWriter error]];
                        }
                    }
                }
                else if (mediaType == AVMediaTypeAudio) {
                    if (assetWriterAudioIn.readyForMoreMediaData) {
                        if (![assetWriterAudioIn appendSampleBuffer:sampleBuffer]) {
                            [self showError:[assetWriter error]];
                        }
                    }
                }
            }
        }
    }
}

- (void)setupQueuedAssetWriter
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
        NSLog(@"Setting up queued asset writer...");
        queuedFileURL = [self nextFileURL];
        NSError *error = nil;
        queuedAssetWriter = [[AVAssetWriter alloc] initWithURL:queuedFileURL fileType:AVFileTypeMPEG4 error:&error];
        if (error) {
            [self showError:error];
            return;
        }
        if ([tracks objectAtIndex:0] == AVMediaTypeVideo) {
            queuedAssetWriterVideoIn = [self addAssetWriterVideoInput:queuedAssetWriter withFormatDesc:videoFormat];
            queuedAssetWriterAudioIn = [self addAssetWriterAudioInput:queuedAssetWriter withFormatDesc:audioFormat];
        } else {
            queuedAssetWriterAudioIn = [self addAssetWriterAudioInput:queuedAssetWriter withFormatDesc:audioFormat];
            queuedAssetWriterVideoIn = [self addAssetWriterVideoInput:queuedAssetWriter withFormatDesc:videoFormat];
        }
    });
}

- (void)doSegmentation
{
    NSLog(@"Segmenting...");
    AVAssetWriter *writer = assetWriter;
    AVAssetWriterInput *audioIn = assetWriterAudioIn;
    AVAssetWriterInput *videoIn = assetWriterVideoIn;
    NSURL *fileURL = currentFileURL;

    //[avCaptureSession beginConfiguration];
    @synchronized(self) {
        assetWriter = queuedAssetWriter;
        assetWriterAudioIn = queuedAssetWriterAudioIn;
        assetWriterVideoIn = queuedAssetWriterVideoIn;
    }
    //[avCaptureSession commitConfiguration];
    currentFileURL = queuedFileURL;

    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0), ^{
        [audioIn markAsFinished];
        [videoIn markAsFinished];
        [writer finishWritingWithCompletionHandler:^{
            if (writer.status == AVAssetWriterStatusCompleted ) {
                [fileURLs addObject:fileURL];
            } else {
                NSLog(@"...WARNING: could not close segment");
            }
        }];
    });
}

3 Answers

1
votes

You can try inserting an #EXT-X-DISCONTINUITY tag between every segment in the m3u8, but I doubt this will work. There are a lot of things that could be going wrong here.
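If you do want to try it, here's a quick sketch of patching a playlist that way (the playlist text below is illustrative, not your actual index):

```python
def add_discontinuities(playlist):
    """Insert #EXT-X-DISCONTINUITY ahead of every segment after the first.

    Segments are introduced by their #EXTINF lines.
    """
    out, seen = [], False
    for line in playlist.splitlines():
        if line.startswith("#EXTINF"):
            if seen:
                out.append("#EXT-X-DISCONTINUITY")
            seen = True
        out.append(line)
    return "\n".join(out)

src = "#EXTM3U\n#EXTINF:5.0,\nseg0.ts\n#EXTINF:5.0,\nseg1.ts\n#EXT-X-ENDLIST"
patched = add_discontinuities(src)
```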

Assuming you are sampling audio at 44.1 kHz, there is a new audio sample every ~22.7 microseconds. During the time you are closing and reopening the file, you are definitely losing samples. If you concatenate the final waveform, it will play back slightly faster than real time due to this loss. In reality, this is probably not an issue.
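To put numbers on that (the 30 ms gap is a made-up figure purely for illustration):

```python
SAMPLE_RATE = 44100  # Hz

# Duration of one audio sample, in microseconds.
sample_period_us = 1e6 / SAMPLE_RATE  # ~22.7 µs

# Samples lost in a hypothetical 30 ms gap while swapping writers.
gap_ms = 30
samples_lost = int(gap_ms * 1e-3 * SAMPLE_RATE)
```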

As @vipw said, you will also have timestamp issues. Every time you start a new mp4, you are starting from timestamp zero, so the player gets confused because the timestamps keep getting reset.
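Conceptually, the fix is to rebase each segment's timestamps onto one shared timeline instead of letting each segment restart at zero. A sketch of the arithmetic, with illustrative values and an assumed 30 fps frame duration:

```python
def rebase_timestamps(segments):
    """Shift each segment's PTS values onto one continuous timeline.

    `segments` is a list of per-segment PTS lists (in seconds),
    each segment starting again from 0.
    """
    rebased, offset = [], 0.0
    for pts_list in segments:
        rebased.append([pts + offset for pts in pts_list])
        # Next segment starts where this one ended; assume the last
        # PTS plus one 30 fps frame marks the segment's end.
        offset = rebased[-1][-1] + 1 / 30
    return rebased

segments = [[0.0, 1.0, 2.0], [0.0, 1.0]]  # each restarts at zero
timeline = rebase_timestamps(segments)
```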

Then there is the transport stream format. TS encapsulates each frame into 'streams'. HLS typically has four (PAT, PMT, audio and video), and each stream is split into 188-byte packets with a 4-byte header. Each header carries a per-stream 4-bit continuity counter that wraps around on overflow. So, by running mediafilesegmenter on every mp4 separately, you are breaking the stream at every segment by resetting the continuity counter back to zero.
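For reference, the continuity counter is the low nibble of the fourth header byte of each 188-byte packet, incrementing mod 16 per PID. A toy checker like this (synthetic packets, not a real TS parser) shows how a counter reset at a segment boundary registers as a discontinuity:

```python
TS_PACKET_SIZE = 188

def continuity_errors(ts_bytes):
    """Count continuity-counter discontinuities per PID in a TS byte stream."""
    last_cc, errors = {}, 0
    for i in range(0, len(ts_bytes), TS_PACKET_SIZE):
        pkt = ts_bytes[i:i + TS_PACKET_SIZE]
        if len(pkt) < TS_PACKET_SIZE or pkt[0] != 0x47:  # 0x47 = sync byte
            continue
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        cc = pkt[3] & 0x0F  # low nibble: continuity counter
        if pid in last_cc and cc != (last_cc[pid] + 1) % 16:
            errors += 1
        last_cc[pid] = cc
    return errors

def packet(pid, cc):
    """Build a minimal 188-byte payload-only packet for testing."""
    return bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | cc]) + bytes(184)

good = b"".join(packet(256, cc % 16) for cc in range(20))   # wraps cleanly
reset = good + packet(256, 0)  # counter jumps back to 0: discontinuity
```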

You need a tool that accepts mp4 input and creates a streaming output that maintains/rewrites the timestamps (PTS, DTS, CTS), as well as the continuity counters.

1
votes

Shifting Packets

We had trouble using older versions of ffmpeg's PTS filter to shift packets. The more recent ffmpeg 1.x and 2.x releases support time shifts for mpegts.

Here's an example of an ffmpeg call; note -t for the duration and -initial_offset for the shift at the end of each output spec. This produces segments with a 10-second shift (the command is broken across lines for readability):

/opt/ffmpeg -i /tmp/cameo/58527/6fc2fa1a7418bf9d4aa90aa384d0eef2244631e8 -threads 0 \
  -ss 10 -i /tmp/cameo/58527/79e684d793e209ebc9b12a5ad82298cb5e94cb54 \
  -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header \
  -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 \
  -b:v 100000 -bt 100000 -bufsize 100000 -maxrate 100000 -r 12 -s 320x180 \
  -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 \
  -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 \
  -segment_format mpegts -y /tmp/cameo/58527/100K%01d.ts \
  -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header \
  -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 \
  -b:v 200000 -bt 200000 -bufsize 200000 -maxrate 200000 -r 12 -s 320x180 \
  -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 \
  -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 \
  -segment_format mpegts -y /tmp/cameo/58527/200K%01d.ts \
  -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header \
  -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 \
  -b:v 364000 -bt 364000 -bufsize 364000 -maxrate 364000 -r 24 -s 320x180 \
  -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 \
  -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 \
  -segment_format mpegts -y /tmp/cameo/58527/364K%01d.ts \
  -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header \
  -crf 28 -profile:v baseline -x264opts level=3:keyint_min=24:keyint=24:scenecut=0 \
  -b:v 664000 -bt 664000 -bufsize 664000 -maxrate 664000 -r 24 -s 480x270 \
  -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 \
  -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 \
  -segment_format mpegts -y /tmp/cameo/58527/664K%01d.ts \
  -codec:v libx264 -pix_fmt yuv420p -preset veryfast -strict -2 -bsf:v h264_mp4toannexb -flags -global_header \
  -crf 23 -profile:v baseline -x264opts level=3.1:keyint_min=24:keyint=24:scenecut=0 \
  -b:v 1264000 -bt 1264000 -bufsize 1264000 -maxrate 1264000 -r 24 -s 640x360 \
  -map 0:0 -map 1:0 -codec:a aac -strict -2 -b:a 64k -ab 64k -ac 2 -ar 44100 \
  -t 9.958333333333334 -segment_time 10.958333333333334 -f segment -initial_offset 10 \
  -segment_format mpegts -y /tmp/cameo/58527/1264K%01d.ts

There's also the adaptation of the C++ segmenter that I've updated on GitHub, but it has only been reasonably tested for video-only mpegts. AV still causes it some issues (I wasn't confident whether the first video packet or the first audio packet should be shifted to the new value, and opted for the first video packet). And, as you noted in your issue, it can have problems with certain media.

If I had more time on my hands, I'd like to debug your specific case and improve the C++ shifter. I hope the ffmpeg example above helps get your HTTP Live Streaming working; we've gone through our share of streaming trouble. We're currently working around an audio pop that occurs at shifted segments. The fix is to gather all the source media before splitting it into segmented streams (which we can do when we finalize a video, but it would slow us down during iterative builds).

0
votes

I think your ts files aren't being created on the same timeline. The ts files carry the presentation timestamps of their packets, and if each segment starts a new timeline, there will be a discontinuity at every boundary.

What might work is to concatenate the recorded segments together so that each new part is timestamped on the same timeline. Then segmenting should work properly, and the segment transitions should be smooth in the generated stream.

I think you need a process that always keeps the last part of the previous segment, so that the timestamps stay synchronized.