4
votes

I have the following code, which captures video from the camera and stores it as a QuickTime movie file using AVAssetWriter. It works, but the aspect ratio is not right because the width and height are hardcoded (480 x 320) in the outputSettings for the AVAssetWriterInput.

I'd rather determine the aspect ratio of the source video and specify the appropriate height (480 x the aspect ratio). Does anybody know how to do this? Should I defer the creation of the AVAssetWriterInput until the first sample buffer arrives?

      // set the sessionPreset to 'medium'
      self.captureSession = [[AVCaptureSession alloc] init];
      self.captureSession.sessionPreset = AVCaptureSessionPresetMedium;
      ...

      // create AVCaptureVideoDataOutput
      self.captureVideo = [[AVCaptureVideoDataOutput alloc] init];
      NSString* formatTypeKey = (NSString*)kCVPixelBufferPixelFormatTypeKey;
      self.captureVideo.videoSettings = @{
        formatTypeKey:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA]
      };
      [self.captureVideo setSampleBufferDelegate:self queue:dispatch_get_main_queue()];

      // create an AVAssetWriter
      NSError* error = nil;
      self.videoWriter = [[AVAssetWriter alloc] initWithURL:url 
                             fileType:AVFileTypeQuickTimeMovie
                             error:&error];
      ...
      // create AVAssetWriterInput with specified settings
      NSDictionary* compression = @{
        AVVideoAverageBitRateKey:[NSNumber numberWithInt:960000],
        AVVideoMaxKeyFrameIntervalKey:[NSNumber numberWithInt:1]
      };
      self.videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
            outputSettings:@{
              AVVideoCodecKey:AVVideoCodecH264,
              AVVideoCompressionPropertiesKey:compression,
              AVVideoWidthKey:[NSNumber numberWithInt:480], // required
              AVVideoHeightKey:[NSNumber numberWithInt:320] // required
            }];

      // add it to the AVAssetWriter
      [self.videoWriter addInput:self.videoInput];


3 Answers

4
votes

Here's how that worked for me; the approach you settled on won't allow your app to scale down the line. You may as well learn to do things correctly at the start, even at the expense of more time and effort.

In my app, after creating the asset writer...

_writer = [[AVAssetWriter alloc] initWithURL:_outURL fileType:AVFileTypeQuickTimeMovie error:outError];

...I grab the asset's video track...

NSArray *videoTracks = [_asset tracksWithMediaType:AVMediaTypeVideo];
if ([videoTracks count] > 0)
    assetVideoTrack = [videoTracks objectAtIndex:0];

...and then create an asset reader track output from it:

_readerVideoOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetVideoTrack outputSettings:decompressionVideoSettings];
[_reader addOutput:_readerVideoOutput];

Then, I load the video track's format descriptions into an array and bridge the first one to a CMFormatDescriptionRef:

CMFormatDescriptionRef formatDescription = NULL;
NSArray *formatDescriptions = [assetVideoTrack formatDescriptions];
if ([formatDescriptions count] > 0)
    formatDescription = (__bridge CMFormatDescriptionRef)[formatDescriptions objectAtIndex:0];

Then, I attempt to read both the clean-aperture settings and the pixel-aspect-ratio settings, and choose between the two at the end:

if (formatDescription)
{
    NSDictionary *cleanAperture = nil;
    NSDictionary *pixelAspectRatio = nil;
    CFDictionaryRef cleanApertureFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_CleanAperture);
    if (cleanApertureFromCMFormatDescription)
    {
        cleanAperture = @{
                          AVVideoCleanApertureWidthKey            : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureWidth),
                          AVVideoCleanApertureHeightKey           : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHeight),
                          AVVideoCleanApertureHorizontalOffsetKey : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHorizontalOffset),
                          AVVideoCleanApertureVerticalOffsetKey   : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureVerticalOffset)
                          };
    }
    CFDictionaryRef pixelAspectRatioFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_PixelAspectRatio);
    if (pixelAspectRatioFromCMFormatDescription)
    {
        pixelAspectRatio = @{
                             AVVideoPixelAspectRatioHorizontalSpacingKey : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioHorizontalSpacing),
                             AVVideoPixelAspectRatioVerticalSpacingKey   : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioVerticalSpacing)
                             };
    }
    // Add whichever settings we could grab from the format description to the compression settings dictionary.
    if (cleanAperture || pixelAspectRatio)
    {
        NSMutableDictionary *mutableCompressionSettings = [NSMutableDictionary dictionary];
        if (cleanAperture)
            [mutableCompressionSettings setObject:cleanAperture forKey:AVVideoCleanApertureKey];
        if (pixelAspectRatio)
            [mutableCompressionSettings setObject:pixelAspectRatio forKey:AVVideoPixelAspectRatioKey];
        compressionSettings = mutableCompressionSettings;
    }
}

That's the part that trips people up: some video tracks have a clean aperture but no pixel aspect ratio, and vice versa. So you attempt to load both, keep whichever set of properties came back populated, and discard the one that did not.
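
To make that concrete, here is a minimal sketch of what the pixel-aspect-ratio values actually mean for display size (the encoded size and the 3:2 ratio are made-up example values, not something read from your capture session):

    // Hypothetical example values; in practice these come from the format description as above.
    CGSize encodedSize = CGSizeMake(480.0, 480.0);
    NSDictionary *pixelAspectRatio = @{
        AVVideoPixelAspectRatioHorizontalSpacingKey : @3,
        AVVideoPixelAspectRatioVerticalSpacingKey   : @2
    };

    // A ratio of h:v means each encoded pixel is displayed h/v times as wide as it is tall,
    // so the display width is the encoded width scaled by that factor.
    double hSpacing = [pixelAspectRatio[AVVideoPixelAspectRatioHorizontalSpacingKey] doubleValue];
    double vSpacing = [pixelAspectRatio[AVVideoPixelAspectRatioVerticalSpacingKey] doubleValue];
    CGSize displaySize = CGSizeMake(encodedSize.width * hSpacing / vSpacing, encodedSize.height);
    NSLog(@"display size: %@", NSStringFromCGSize(displaySize)); // 480 x 480 encoded with 3:2 pixels displays as 720 x 480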

Keep in mind that, despite the variation you will see in sample code, there is essentially only one correct way to nest all of the methods required to read and write a media file on iOS. The most prudent thing you can do is make sure you are following that structure from the start.

If you're interested in seeing what that looks like, here it is:

#import "ExportVideo.h"

@implementation ExportVideo

@synthesize url = _url;
@synthesize renderer = _renderer;

- (id)initWithURL:(NSURL *)url usingRenderer:(GLKitView *)renderer {
    NSLog(@"ExportVideo");
    if (!(self = [super init])) {
        return nil;
    }

    self.url = url;
    self.renderer = renderer;

    NSString *serializationQueueDescription = [NSString stringWithFormat:@"%@ serialization queue", self];
    _mainSerializationQueue = dispatch_queue_create([serializationQueueDescription UTF8String], NULL);

    NSString *rwAudioSerializationQueueDescription = [NSString stringWithFormat:@"%@ rw audio serialization queue", self];
    _rwAudioSerializationQueue = dispatch_queue_create([rwAudioSerializationQueueDescription UTF8String], NULL);

    NSString *rwVideoSerializationQueueDescription = [NSString stringWithFormat:@"%@ rw video serialization queue", self];
    _rwVideoSerializationQueue = dispatch_queue_create([rwVideoSerializationQueueDescription UTF8String], NULL);

    return self;
}

- (void)startProcessing {
    NSDictionary *inputOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES] forKey:AVURLAssetPreferPreciseDurationAndTimingKey];
    _asset = [[AVURLAsset alloc] initWithURL:self.url options:inputOptions];
    NSLog(@"URL: %@", self.url);
    _cancelled = NO;
    [_asset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"] completionHandler: ^{
        dispatch_async(_mainSerializationQueue, ^{
            if (_cancelled)
                return;
            BOOL success = YES;
            NSError *localError = nil;
            success = ([_asset statusOfValueForKey:@"tracks" error:&localError] == AVKeyValueStatusLoaded);
            if (success)
            {
                NSFileManager *fm = [NSFileManager defaultManager];
                NSString *localOutputPath = [self.url path];
                if ([fm fileExistsAtPath:localOutputPath])
                    success = [fm removeItemAtPath:localOutputPath error:&localError];
            }
            if (success)
                success = [self setupAssetReaderAndAssetWriter:&localError];
            if (success)
                success = [self startAssetReaderAndWriter:&localError];
            if (!success)
                [self readingAndWritingDidFinishSuccessfully:success withError:localError];
        });
    }];
}


- (BOOL)setupAssetReaderAndAssetWriter:(NSError **)outError
{
    // Create and initialize the asset reader.
    _reader = [[AVAssetReader alloc] initWithAsset:_asset error:outError];
    BOOL success = (_reader != nil);
    if (success)
    {
        // If the asset reader was successfully initialized, do the same for the asset writer.
        NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
        _outputURL = paths[0];
        NSFileManager *manager = [NSFileManager defaultManager];
        [manager createDirectoryAtPath:_outputURL withIntermediateDirectories:YES attributes:nil error:nil];
        _outputURL = [_outputURL stringByAppendingPathComponent:@"output.mov"];
        [manager removeItemAtPath:_outputURL error:nil];
        _outURL = [NSURL fileURLWithPath:_outputURL];
        _writer = [[AVAssetWriter alloc] initWithURL:_outURL fileType:AVFileTypeQuickTimeMovie error:outError];
        success = (_writer != nil);
    }

    if (success)
    {
        // If the reader and writer were successfully initialized, grab the audio and video asset tracks that will be used.
        AVAssetTrack *assetAudioTrack = nil, *assetVideoTrack = nil;
        NSArray *audioTracks = [_asset tracksWithMediaType:AVMediaTypeAudio];
        if ([audioTracks count] > 0)
            assetAudioTrack = [audioTracks objectAtIndex:0];
        NSArray *videoTracks = [_asset tracksWithMediaType:AVMediaTypeVideo];
        if ([videoTracks count] > 0)
            assetVideoTrack = [videoTracks objectAtIndex:0];

        if (assetAudioTrack)
        {
            // If there is an audio track to read, set the decompression settings to Linear PCM and create the asset reader output.
            NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
            _readerAudioOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetAudioTrack outputSettings:decompressionAudioSettings];
            [_reader addOutput:_readerAudioOutput];
            // Then, set the compression settings to 128kbps AAC and create the asset writer input.
            AudioChannelLayout stereoChannelLayout = {
                .mChannelLayoutTag = kAudioChannelLayoutTag_Stereo,
                .mChannelBitmap = 0,
                .mNumberChannelDescriptions = 0
            };
            NSData *channelLayoutAsData = [NSData dataWithBytes:&stereoChannelLayout length:offsetof(AudioChannelLayout, mChannelDescriptions)];
            NSDictionary *compressionAudioSettings = @{
                                                       AVFormatIDKey         : [NSNumber numberWithUnsignedInt:kAudioFormatMPEG4AAC],
                                                       AVEncoderBitRateKey   : [NSNumber numberWithInteger:128000],
                                                       AVSampleRateKey       : [NSNumber numberWithInteger:44100],
                                                       AVChannelLayoutKey    : channelLayoutAsData,
                                                       AVNumberOfChannelsKey : [NSNumber numberWithUnsignedInteger:2]
                                                       };
            _writerAudioInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetAudioTrack mediaType] outputSettings:compressionAudioSettings];
            [_writer addInput:_writerAudioInput];
        }

        if (assetVideoTrack)
        {
            // If there is a video track to read, set the decompression settings for YUV and create the asset reader output.
            NSDictionary *decompressionVideoSettings = @{
                                                         (id)kCVPixelBufferPixelFormatTypeKey     : [NSNumber numberWithUnsignedInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange],
                                                         (id)kCVPixelBufferIOSurfacePropertiesKey : [NSDictionary dictionary]
                                                         };
            _readerVideoOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetVideoTrack outputSettings:decompressionVideoSettings];
            [_reader addOutput:_readerVideoOutput];
            CMFormatDescriptionRef formatDescription = NULL;
            // Grab the video format descriptions from the video track and grab the first one if it exists.
            NSArray *formatDescriptions = [assetVideoTrack formatDescriptions];
            if ([formatDescriptions count] > 0)
                formatDescription = (__bridge CMFormatDescriptionRef)[formatDescriptions objectAtIndex:0];
            CGSize trackDimensions = {
                .width = 0.0,
                .height = 0.0,
            };
            // If the video track had a format description, grab the track dimensions from there. Otherwise, grab them direcly from the track itself.
            if (formatDescription)
                trackDimensions = CMVideoFormatDescriptionGetPresentationDimensions(formatDescription, false, false);
            else
                trackDimensions = [assetVideoTrack naturalSize];
            NSDictionary *compressionSettings = nil;
            // If the video track had a format description, attempt to grab the clean aperture settings and pixel aspect ratio used by the video.
            if (formatDescription)
            {
                NSDictionary *cleanAperture = nil;
                NSDictionary *pixelAspectRatio = nil;
                CFDictionaryRef cleanApertureFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_CleanAperture);
                if (cleanApertureFromCMFormatDescription)
                {
                    cleanAperture = @{
                                      AVVideoCleanApertureWidthKey            : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureWidth),
                                      AVVideoCleanApertureHeightKey           : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHeight),
                                      AVVideoCleanApertureHorizontalOffsetKey : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHorizontalOffset),
                                      AVVideoCleanApertureVerticalOffsetKey   : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureVerticalOffset)
                                      };
                }
                CFDictionaryRef pixelAspectRatioFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_PixelAspectRatio);
                if (pixelAspectRatioFromCMFormatDescription)
                {
                    pixelAspectRatio = @{
                                         AVVideoPixelAspectRatioHorizontalSpacingKey : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioHorizontalSpacing),
                                         AVVideoPixelAspectRatioVerticalSpacingKey   : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioVerticalSpacing)
                                         };
                }
                // Add whichever settings we could grab from the format description to the compression settings dictionary.
                if (cleanAperture || pixelAspectRatio)
                {
                    NSMutableDictionary *mutableCompressionSettings = [NSMutableDictionary dictionary];
                    if (cleanAperture)
                        [mutableCompressionSettings setObject:cleanAperture forKey:AVVideoCleanApertureKey];
                    if (pixelAspectRatio)
                        [mutableCompressionSettings setObject:pixelAspectRatio forKey:AVVideoPixelAspectRatioKey];
                    compressionSettings = mutableCompressionSettings;
                }
            }
            // Create the video settings dictionary for H.264.
            NSMutableDictionary *videoSettings = [NSMutableDictionary dictionaryWithDictionary:@{
                                                                           AVVideoCodecKey  : AVVideoCodecH264,
                                                                           AVVideoWidthKey  : [NSNumber numberWithDouble:trackDimensions.width],
                                                                           AVVideoHeightKey : [NSNumber numberWithDouble:trackDimensions.height]
                                                                           }];
            // Put the compression settings into the video settings dictionary if we were able to grab them.
            if (compressionSettings)
                [videoSettings setObject:compressionSettings forKey:AVVideoCompressionPropertiesKey];
            // Create the asset writer input and add it to the asset writer.
            _writerVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetVideoTrack mediaType] outputSettings:videoSettings];
            [_writer addInput:_writerVideoInput];
        }
    }
    return success;
}

- (BOOL)startAssetReaderAndWriter:(NSError **)outError
{
    BOOL success = YES;
    // Attempt to start the asset reader.
    success = [_reader startReading];
    if (!success) {
        *outError = [_reader error];
        NSLog(@"Reader error");
    }
    if (success)
    {
        // If the reader started successfully, attempt to start the asset writer.
        success = [_writer startWriting];
        if (!success) {
            *outError = [_writer error];
            NSLog(@"Writer error");
        }
    }

    if (success)
    {
        // If the asset reader and writer both started successfully, create the dispatch group where the reencoding will take place and start a sample-writing session.
        _dispatchGroup = dispatch_group_create();
        [_writer startSessionAtSourceTime:kCMTimeZero];
        _audioFinished = NO;
        _videoFinished = NO;

        if (_writerAudioInput)
        {
            // If there is audio to reencode, enter the dispatch group before beginning the work.
            dispatch_group_enter(_dispatchGroup);
            // Specify the block to execute when the asset writer is ready for audio media data, and specify the queue to call it on.
            [_writerAudioInput requestMediaDataWhenReadyOnQueue:_rwAudioSerializationQueue usingBlock:^{
                // Because the block is called asynchronously, check to see whether its task is complete.
                if (_audioFinished)
                    return;
                BOOL completedOrFailed = NO;
                // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                while ([_writerAudioInput isReadyForMoreMediaData] && !completedOrFailed)
                {
                    // Get the next audio sample buffer, and append it to the output file.
                    CMSampleBufferRef sampleBuffer = [_readerAudioOutput copyNextSampleBuffer];
                    if (sampleBuffer != NULL)
                    {
                        BOOL success = [_writerAudioInput appendSampleBuffer:sampleBuffer];
                        CFRelease(sampleBuffer);
                        sampleBuffer = NULL;
                        completedOrFailed = !success;
                    }
                    else
                    {
                        completedOrFailed = YES;
                    }
                }
                if (completedOrFailed)
                {
                    // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the audio work has finished).
                    BOOL oldFinished = _audioFinished;
                    _audioFinished = YES;
                    if (oldFinished == NO)
                    {
                        [_writerAudioInput markAsFinished];
                    }
                    dispatch_group_leave(_dispatchGroup);
                }
            }];
        }

        if (_writerVideoInput)
        {
            // If we had video to reencode, enter the dispatch group before beginning the work.
            dispatch_group_enter(_dispatchGroup);
            // Specify the block to execute when the asset writer is ready for video media data, and specify the queue to call it on.
            [_writerVideoInput requestMediaDataWhenReadyOnQueue:_rwVideoSerializationQueue usingBlock:^{
                // Because the block is called asynchronously, check to see whether its task is complete.
                if (_videoFinished)
                    return;
                BOOL completedOrFailed = NO;
                // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                while ([_writerVideoInput isReadyForMoreMediaData] && !completedOrFailed)
                {
                    // Get the next video sample buffer, and append it to the output file.
                    CMSampleBufferRef sampleBuffer = [_readerVideoOutput copyNextSampleBuffer];

                    /* PROCESS FRAME HERE */
                    //CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
                    //_currentBuffer = pixelBuffer;
                    //[self performSelectorOnMainThread:@selector(processFrame) withObject:nil waitUntilDone:YES];
                    /* ------------------ */

                    if (sampleBuffer != NULL)
                    {
                        BOOL success = [_writerVideoInput appendSampleBuffer:sampleBuffer];
                        CFRelease(sampleBuffer);
                        sampleBuffer = NULL;
                        completedOrFailed = !success;
                    }
                    else
                    {
                        completedOrFailed = YES;
                    }
                }
                if (completedOrFailed)
                {
                    // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the video work has finished).
                    BOOL oldFinished = _videoFinished;
                    _videoFinished = YES;
                    if (oldFinished == NO)
                    {
                        [_writerVideoInput markAsFinished];
                    }
                    dispatch_group_leave(_dispatchGroup);
                }
            }];
        }
        // Set up the notification that the dispatch group will send when the audio and video work have both finished.
        dispatch_group_notify(_dispatchGroup, _mainSerializationQueue, ^{
            BOOL finalSuccess = YES;
            NSError *finalError = nil;
            // Check to see if the work has finished due to cancellation.
            if (_cancelled)
            {
                // If so, cancel the reader and writer.
                [_reader cancelReading];
                [_writer cancelWriting];
            }
            else
            {
                // If cancellation didn't occur, first make sure that the asset reader didn't fail.
                if ([_reader status] == AVAssetReaderStatusFailed)
                {
                    finalSuccess = NO;
                    finalError = [_reader error];
                    NSLog(@"_reader finalError: %@", finalError);
                }
                // If the asset reader didn't fail, attempt to stop the asset writer and check for any errors.
                [_writer finishWritingWithCompletionHandler:^{
                    UISaveVideoAtPathToSavedPhotosAlbum(_outputURL, nil, nil, nil);
                    [self readingAndWritingDidFinishSuccessfully:finalSuccess withError:[_writer error]];
                }];
            }
            // Call the method to handle completion, and pass in the appropriate parameters to indicate whether reencoding was successful.

        });
    }
    // Return success here to indicate whether the asset reader and writer were started successfully.
    return success;
}

- (void)readingAndWritingDidFinishSuccessfully:(BOOL)success withError:(NSError *)error
{
    if (!success)
    {
        // If the reencoding process failed, we need to cancel the asset reader and writer.
        [_reader cancelReading];
        [_writer cancelWriting];
        dispatch_async(dispatch_get_main_queue(), ^{
            // Handle any UI tasks here related to failure.
        });
    }
    else
    {
        // Reencoding was successful, reset booleans.
        _cancelled = NO;
        _videoFinished = NO;
        _audioFinished = NO;
        dispatch_async(dispatch_get_main_queue(), ^{
            // Handle any UI tasks here related to success.
        });
    }
    NSLog(@"readingAndWritingDidFinishSuccessfully success = %@ : Error = %@", (success == 0) ? @"NO" : @"YES", error);
}

@end
1
votes

One easy solution would be to use one of the fixed-size AVCaptureSession presets, e.g. AVCaptureSessionPreset640x480. It seems there is no public API to get the AVCaptureSession resolution before capturing begins:

Knowing resolution of AVCaptureSession's session presets
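
For example, a minimal sketch of that approach against the code in the question (the 640x480 preset and the matching writer dimensions are an assumption; they just need to agree with each other):

    // Pick a preset with known dimensions...
    self.captureSession.sessionPreset = AVCaptureSessionPreset640x480;

    // ...so the writer input can be created up front with matching width and height.
    self.videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
          outputSettings:@{
            AVVideoCodecKey:AVVideoCodecH264,
            AVVideoWidthKey:[NSNumber numberWithInt:640],
            AVVideoHeightKey:[NSNumber numberWithInt:480]
          }];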

0
votes

Thank you for the quick answer, Tark. Based on your answer, I have decided to defer the creation of the AVAssetWriterInput until the very first frame, and it works great. Here is the code.

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    CVPixelBufferUnlockBaseAddress(imageBuffer,0);
    NSLog(@"NVW size=%zd, %zd", width, height);

    NSDictionary* compression = @{
        AVVideoAverageBitRateKey:[NSNumber numberWithInt:960000],
        AVVideoMaxKeyFrameIntervalKey:[NSNumber numberWithInt:1]
    };
    self.videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
        outputSettings:@{
            AVVideoCodecKey:AVVideoCodecH264,
            AVVideoCompressionPropertiesKey:compression,
            AVVideoWidthKey:[NSNumber numberWithInteger:(NSInteger)width],
            AVVideoHeightKey:[NSNumber numberWithInteger:(NSInteger)height]
        }];
    self.videoInput.expectsMediaDataInRealTime = YES;
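
For completeness, here is a rough sketch of where that deferred setup might sit in the capture delegate (the `videoInputConfigured` flag and the `setupVideoInputWithSampleBuffer:` helper are hypothetical names wrapping the code above, not part of the original):

    - (void)captureOutput:(AVCaptureOutput *)captureOutput
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection {
        // Lazily create the writer input from the first frame's dimensions.
        if (!self.videoInputConfigured) {
            [self setupVideoInputWithSampleBuffer:sampleBuffer]; // the code above
            [self.videoWriter addInput:self.videoInput];
            [self.videoWriter startWriting];
            [self.videoWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
            self.videoInputConfigured = YES;
        }
        // Append this and all subsequent frames as long as the input can accept them.
        if (self.videoInput.isReadyForMoreMediaData) {
            [self.videoInput appendSampleBuffer:sampleBuffer];
        }
    }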