FFmpeg AVFrame Audio Data Modification

Question

I'm trying to figure out how FFmpeg saves data in an AVFrame after the audio has been decoded.

Basically, if I print the data in the AVFrame->data[] array, I get a series of unsigned 8-bit integers, which is the audio in raw format.

From what I can understand from the FFmpeg doxygen, the format of the data is given by the enum AVSampleFormat, and there are 2 main categories: interleaved and planar. In the interleaved type, all the data is kept in the first row of the AVFrame->data array, with size AVFrame->linesize[0]. In the planar type, each channel of the audio file is kept in a separate row of the AVFrame->data array, and each of those rows has size AVFrame->linesize[0].

Is there a guide/tutorial that explains what the numbers in the array mean for each of the formats?

Sergio Sergio · Accepted Answer · 2016-09-05T12:39:15

The values in each of the data arrays (planes) are the actual audio samples in the specified format. E.g. if the format is AV_SAMPLE_FMT_S16P, the data arrays are really arrays of int16_t PCM samples (the pointers are declared as uint8_t *, so cast them before reading). For a planar format like this one, if we are dealing with a mono signal, only data[0] is valid; if it is stereo, data[0] and data[1] are valid, and so on.
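To make the indexing concrete, here is a minimal sketch of reading one 16-bit sample from either layout. It does not use libav at all: MockFrame and sample_s16 are hypothetical names made up for this illustration, and the struct only mimics the few AVFrame fields that matter here. It assumes signed 16-bit native-endian samples (as in AV_SAMPLE_FMT_S16 and AV_SAMPLE_FMT_S16P).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the few AVFrame fields used here; not an FFmpeg type. */
typedef struct {
    uint8_t *data[8];   /* plane pointers, laid out like AVFrame->data */
    int nb_samples;     /* number of samples per channel */
    int channels;       /* number of channels */
    int planar;         /* 1 for a planar format (S16P), 0 for interleaved (S16) */
} MockFrame;

/* Return sample i of channel ch, assuming 16-bit signed native-endian PCM. */
static int16_t sample_s16(const MockFrame *f, int ch, int i)
{
    assert(ch >= 0 && ch < f->channels);
    assert(i >= 0 && i < f->nb_samples);

    /* Planar: one plane per channel, indexed directly by sample number.
       Interleaved: everything is in data[0], ordered L0 R0 L1 R1 ... for stereo. */
    const uint8_t *plane = f->planar ? f->data[ch] : f->data[0];
    size_t index = f->planar ? (size_t)i : (size_t)i * (size_t)f->channels + (size_t)ch;

    /* Copy the bytes out instead of casting the pointer, so alignment is not an issue. */
    int16_t v;
    memcpy(&v, plane + index * sizeof v, sizeof v);
    return v;
}
```

With a real decoded frame you would take the layout from av_sample_fmt_is_planar(frame->format) and the sample count from frame->nb_samples instead of filling in a mock struct. Other formats work the same way with a different element type, e.g. float for AV_SAMPLE_FMT_FLT and AV_SAMPLE_FMT_FLTP.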

I'm not sure there is any guide that explains each particular case, but the approach described above is quite simple and easy to understand. Just play with it a bit and things should become clear.
