I'm learning video and audio codecs with FFmpeg. I'm struggling to understand frame size and some related concepts.

"Frame size: This is the size in bytes of each frame. This can vary a lot: if each sample is 8 bits, and we’re handling mono sound, the frame size is one byte. Similarly in 6 channel audio with 64 bit floating point samples, the frame size is 48 bytes." (PCM Terminology and Concepts)

As described above, with 64-bit samples and 6 channels the frame size is 48 bytes. My code computed 96 (16 bits * 6 channels, which is 96 bits, or 12 bytes). On the other hand, stream->codecpar->frame_size returned 1024. Why are they different? Is the frame size of 1024 for all 6 channels together, or for each channel?

main.cpp:

else if (stream->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
    std::cout << "audio sample rate: " << stream->codecpar->sample_rate << std::endl;
    std::cout << "audio bits: " << stream->codecpar->bits_per_coded_sample << std::endl;
    std::cout << "audio channels: " << stream->codecpar->channels << std::endl;
    std::cout << "audio frame size: " << stream->codecpar->frame_size << std::endl; 
    std::cout << "audio frame size: (16bits * 6 channels) = " << stream->codecpar->channels * stream->codecpar->bits_per_coded_sample << std::endl; 
    audio_stream_index = i;
}

console:

audio sample rate: 48000
audio bits: 16
audio channels: 6
audio frame size: 1024
audio frame size: (16bits * 6 channels) = 96

1 Answer

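In FFmpeg, audio frame size is a count of samples per channel, not a size in bytes. So the 1024 applies to each channel, and one audio frame of a 16-bit 4-channel PCM stream will have 1024 x 16 x 4 = 65536 bits = 8192 bytes. With the 6 channels from the question, and assuming the decoded samples are 16-bit, that would be 1024 x 16 x 6 = 98304 bits = 12288 bytes.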
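
To make the arithmetic above concrete, here is a minimal sketch in plain C++ that does not call FFmpeg at all. The helper names pcm_sample_frame_bytes and codec_frame_bytes are made up for illustration, and the sketch assumes interleaved or planar raw PCM with a whole number of bytes per sample.

```cpp
#include <cstdint>

// Hypothetical helper: bytes in one PCM "sample frame", i.e. one sample
// for every channel. This is the frame size in the sense of the quoted
// PCM terminology (6 channels x 64-bit samples = 48 bytes).
constexpr std::int64_t pcm_sample_frame_bytes(int channels, int bits_per_sample) {
    return static_cast<std::int64_t>(channels) * (bits_per_sample / 8);
}

// Hypothetical helper: bytes of raw PCM in one codec audio frame.
// samples_per_channel plays the role of FFmpeg's frame_size, which is a
// per-channel sample count rather than a byte count.
constexpr std::int64_t codec_frame_bytes(int samples_per_channel,
                                         int channels,
                                         int bits_per_sample) {
    return samples_per_channel * pcm_sample_frame_bytes(channels, bits_per_sample);
}
```

For example, codec_frame_bytes(1024, 4, 16) gives 8192, matching the figure above, and pcm_sample_frame_bytes(6, 16) gives 12, which is the 96 bits printed by the code in the question.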