What is the process for generating a .mov or .m4a file from arrays of Int16 used as the stereo channels of the audio?

I can easily extract raw PCM data as [Int16] from a .mov file, store it in two files, leftChannel.pcm and rightChannel.pcm, and perform some operations on it for later use. But I am not able to regenerate the video from these files.

Any approach will work, whether direct video generation from raw PCM or an intermediate step that generates an m4a from the PCM.

Update:

I figured out how to convert the PCM arrays to an audio file, but it won't play.

private func convertToM4a(leftChannel leftPath : URL, rightChannel rigthPath : URL, converterCallback : ConverterCallback){

    let m4aUrl = FileManagerUtil.getTempFileName(parentFolder: FrameExtractor.PCM_ENCODE_FOLDER, fileNameWithExtension: "encodedAudio.m4a")
    if FileManager.default.fileExists(atPath: m4aUrl.path) {
        try! FileManager.default.removeItem(atPath: m4aUrl.path)
    }
    do{
        let leftBuffer = try NSArray(contentsOf: leftPath, error: ()) as! [Int16]
        let rightBuffer = try NSArray(contentsOf: rigthPath, error: ()) as! [Int16]

        let sampleRate = 44100
        let channels = 2
        let frameCapacity = (leftBuffer.count + rightBuffer.count)/2

        let outputSettings = [
            AVFormatIDKey : NSInteger(kAudioFormatMPEG4AAC),
            AVSampleRateKey : NSInteger(sampleRate),
            AVNumberOfChannelsKey : NSInteger(channels),
            AVAudioFileTypeKey : NSInteger(kAudioFileAAC_ADTSType),
            AVLinearPCMIsBigEndianKey : true,
            ] as [String : Any]

        let audioFile = try AVAudioFile(forWriting: m4aUrl, settings: outputSettings, commonFormat: .pcmFormatInt16, interleaved: false)

        let format = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(sampleRate), channels: AVAudioChannelCount(channels), interleaved: false)!

        let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameCapacity))!
        pcmBuffer.frameLength = pcmBuffer.frameCapacity

        for i in 0..<leftBuffer.count {
            pcmBuffer.int16ChannelData![0][i] = leftBuffer[i]
        }

        for i in 0..<rightBuffer.count {
            pcmBuffer.int16ChannelData![1][i] = rightBuffer[i]
        }

        try! audioFile.write(from: pcmBuffer)

        converterCallback.m4aEncoded(to: m4aUrl)

    } catch {
        print(error.localizedDescription)
    }
}

Saving it as .m4a, with AVAudioFileTypeKey set to the m4a type, gave a malformed file error.

Saving it as .aac with the settings above produces a file that plays, but the sound is broken: just a buzzing noise with a slowed-down version of the original audio. Initially I thought it had something to do with the input and output sampling rates, but that was not the case.

I assume that something is wrong in the output settings dictionary. Any help would be appreciated.

PCM is the lingua franca of digital audio ... all audio codecs ultimately get converted down to PCM when interacting with the ADC or DAC hardware devices ... as such, when starting with PCM it is natural to convert it into any and all audio codecs ... in fact, the WAV format is simply raw PCM with a 44-byte header attached in front. - Scott Stensland
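
To illustrate the comment's point about WAV, here is a minimal sketch of the "intermediate file" route: it interleaves two Int16 channel arrays and prepends the canonical 44-byte RIFF/WAVE header. The function name `makeStereoWav` and its parameters are hypothetical, and the sketch assumes 16-bit stereo PCM, with 44100 Hz as the default sample rate.

```swift
import Foundation

// Little-endian byte helpers (WAV stores all header fields little-endian).
func le16(_ v: UInt16) -> [UInt8] {
    return [UInt8(v & 0xff), UInt8(v >> 8)]
}

func le32(_ v: UInt32) -> [UInt8] {
    return [UInt8(v & 0xff), UInt8((v >> 8) & 0xff),
            UInt8((v >> 16) & 0xff), UInt8((v >> 24) & 0xff)]
}

/// Builds a 16-bit stereo WAV file in memory:
/// a 44-byte header followed by interleaved L/R samples.
func makeStereoWav(left: [Int16], right: [Int16], sampleRate: UInt32 = 44_100) -> Data {
    // Assumption: if the channels differ in length, the extra samples are dropped.
    let frames = min(left.count, right.count)
    let channels: UInt16 = 2
    let bitsPerSample: UInt16 = 16
    let blockAlign: UInt16 = channels * bitsPerSample / 8   // bytes per frame
    let dataSize = UInt32(frames) * UInt32(blockAlign)

    var bytes = [UInt8]()
    bytes += Array("RIFF".utf8)
    bytes += le32(36 + dataSize)                   // size of everything after this field
    bytes += Array("WAVE".utf8)
    bytes += Array("fmt ".utf8)
    bytes += le32(16)                              // size of the fmt chunk body
    bytes += le16(1)                               // format tag 1 = linear PCM
    bytes += le16(channels)
    bytes += le32(sampleRate)
    bytes += le32(sampleRate * UInt32(blockAlign)) // byte rate
    bytes += le16(blockAlign)
    bytes += le16(bitsPerSample)
    bytes += Array("data".utf8)
    bytes += le32(dataSize)

    // Interleave the samples: L0 R0 L1 R1 ...
    for i in 0..<frames {
        bytes += le16(UInt16(bitPattern: left[i]))
        bytes += le16(UInt16(bitPattern: right[i]))
    }
    return Data(bytes)
}
```

Writing the result with `try makeStereoWav(left: l, right: r).write(to: url)` should give a .wav file that ordinary audio tools can open, which is also a quick way to check whether the Int16 arrays themselves contain sane audio before any AAC encoding is attempted.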

1 Answer

At least the creation of the AAC file works with the code you are showing.

I wrote out two NSArrays with valid Int16 audio data and, with your code, got a valid result: played in QuickTime Player (using the .aac suffix), it sounds the same as the input.

[Image: encoded audio]

How are you creating the input?

A buzzing sound (with lots of noise) occurs, for example, if you read audio data using an AVAudioFormat with .pcmFormatInt16 while the data actually read is in .pcmFormatFloat32 (the most common default format). Unfortunately, there is no runtime warning if you do so.
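
To see why such a mismatch produces noise, here is a small sketch that reinterprets the raw bytes of Float32 samples as Int16 samples. It is only an illustration of the effect, not a reconstruction of what your code does; the function name `misreadAsInt16` is made up for this example.

```swift
/// Reinterprets the raw bytes of Float32 samples as Int16 samples, which is
/// effectively what happens when Float32 audio is read as if it were Int16.
func misreadAsInt16(_ samples: [Float]) -> [Int16] {
    return samples.withUnsafeBytes { raw in
        // Step through the buffer two bytes at a time and load each pair as an Int16.
        stride(from: 0, to: raw.count, by: MemoryLayout<Int16>.size).map {
            raw.load(fromByteOffset: $0, as: Int16.self)
        }
    }
}

let original: [Float] = [0.0, 0.5, 1.0, -1.0]
let garbage = misreadAsInt16(original)

// Each 4-byte Float turns into two 2-byte Int16 values, so the sample count doubles.
print(original.count, garbage.count)
print(garbage)
```

Each Float becomes two unrelated Int16 values (mantissa bits and sign/exponent bits), so the result has twice as many samples as the input. Played back at the same sample rate, that would sound like noise stretched to roughly half speed, which may match the "slow mo" buzzing you describe.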

If that's the case, try using .pcmFormatFloat32. If you need the data as Int16, you can convert it yourself by mapping [-1, 1] to [-32768, 32767] for both channels:

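let fac = Float(1 << 15)  // 32768, scales [-1, 1] to the Int16 range
for i in 0..<count {
    // Scale, then clamp to [-32768, 32767] so the Int16 conversion cannot overflow
    let val = min(max(inBuffer!.floatChannelData![ch][i] * fac, -fac), fac - 1)
    xxx[i] = Int16(val)
}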
...
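
For experimenting outside of AVFoundation, the same mapping can be written as a standalone function on plain arrays. This is a sketch under a few assumptions: the name `floatToInt16` is hypothetical, the input is expected to be normalized to [-1, 1], and NaN is mapped to silence (a choice made here, since converting NaN to Int16 would trap).

```swift
/// Converts normalized Float samples in [-1, 1] to Int16, clamping out-of-range input.
func floatToInt16(_ samples: [Float]) -> [Int16] {
    let scale = Float(1 << 15) // 32768
    return samples.map { (sample: Float) -> Int16 in
        // Assumption: treat NaN as silence, because Int16(Float.nan) traps at runtime.
        guard !sample.isNaN else { return 0 }
        // Scale, then clamp to the representable range [-32768, 32767].
        let clamped = min(max(sample * scale, -scale), scale - 1)
        return Int16(clamped)
    }
}

let converted = floatToInt16([0.0, 0.5, 1.0, -1.0, 2.0])
print(converted)
```

Note the asymmetry: -1.0 maps exactly to -32768, while +1.0 would scale to 32768 and is clamped to 32767. Run it once per channel, e.g. on the contents of floatChannelData for channel 0 and channel 1.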