
I want to write a C++ application that opens an mp4 file and decodes it to raw YUV422. I wrote some code based on the libavcodec tutorial, but I couldn't find where to set the bit depth and the pixel format to YUV422.

Here is some of the code I wrote:

void read_video_stream(AVFormatContext *pFormatContext, AVCodec *pCodec, AVCodecParameters *pCodecParameters,
                       int video_stream_index)
{
    AVCodecContext *pCodecContext = avcodec_alloc_context3(pCodec);
    std::unique_ptr<AVCodecContext, av_deleter> ctx_guard(pCodecContext);
    if (!pCodecContext) {
        return;
    }
    if (avcodec_parameters_to_context(pCodecContext, pCodecParameters) < 0) {
        return;
    }
    // i tried setting it here
    if (avcodec_open2(pCodecContext, pCodec, NULL) < 0) {
        return;
    }
    while (true) {
        std::unique_ptr<AVPacket, std::function<void(AVPacket*)>> packet{
                        av_packet_alloc(),
                        [](AVPacket* p){ av_packet_free(&p); }};
        int response = av_read_frame(pFormatContext, packet.get());
        if (response == AVERROR_EOF) {
            std::cout << "EOF\n";
            break; // no more packets; a complete implementation would now flush the decoder
        }
        if (response < 0) {
            std::cout << "Error " << response << "\n";
            return;
        }
        if (packet->stream_index != video_stream_index) {
            continue;
        }
        response = avcodec_send_packet(pCodecContext, packet.get());
        if (response < 0) {
            std::cout << "Error while sending a packet to the decoder: " << response;
            return;
        }
        while (response >= 0) {
            std::shared_ptr<AVFrame> pFrame{  av_frame_alloc(), AVFrameDeleter};
            response = avcodec_receive_frame(pCodecContext, pFrame.get());
            if (response == AVERROR(EAGAIN)) {
                break; // decoder needs more input; send the next packet
            }
            if (response == AVERROR_EOF) {
                std::cerr << "got to last frame\n";
                return;
            }
            else if (response < 0) {
                std::cerr << "Error while receiving a frame from the decoder: " << response;
                return;
            }

            // copy frame to cyclic buffer
            cb.push_back(std::move(pFrame));
        }
    }
}

My end goal is to send the uncompressed data (which needs to be in pFrame->data[0-2]) to a device on the network. Can you please help me with this issue? Thanks.

You don't decode to a specific format. You decode to whatever format the stream was encoded in. If you want a different format, you convert (with swscale) after it's decoded. – szatmary
Which specific format of uncompressed data does the network device expect? – Alex Cohn
y00 cb00 cr00 y01, y02 cb01 cr01 y03 .. – rafi wiener

1 Answer


The comment says it all: "You don't decode to a specific format. You decode to whatever format the stream was encoded in. If you want a different format, you convert (with swscale) after it's decoded."

I'll fill in the gaps. The vast majority of movie files (including mp4) are encoded in YUV420P, so that is what you'll usually get after decoding. But it may differ, so I'll call it the decoded pixel format. Once you have an AVFrame in the decoded pixel format, you can convert it to any other pixel format with libswscale.

Pixel format conversion (with libswscale) requires two things:

1) Set up a scaler context. In FFmpeg, pixel format conversion and scaling are done by the same function. Keep both sizes the same and you'll get no scaling, just conversion.

You don't need to do this more than once, unless the parameters change and the context is no longer valid:

SwsContext *swsContext = sws_getContext(src_width, src_height, src_pixfmt,
                                        dst_width, dst_height, dst_pixfmt,
                                        SWS_BILINEAR, NULL, NULL, NULL);

An example src_pixfmt is AV_PIX_FMT_YUV420P.
An example dst_pixfmt is AV_PIX_FMT_YUV422P or AV_PIX_FMT_UYVY422.
SWS_BILINEAR is the scaling algorithm. You may need this at some point, when scaling is also needed. They say bilinear is good for upscaling and bicubic for downscaling. I'm no expert in this field, but what I know is that bilinear works well and is faster than many other algorithms.

2) To do the conversion, you'll need something like this:

AVFrame *dstframe = av_frame_alloc();
if (dstframe == NULL)
{
    fprintf(stderr, "Error: av_frame_alloc() failed.\n");
    exit(EXIT_FAILURE);
}

dstframe->format = AV_PIX_FMT_UYVY422; /* choose same format set on sws_getContext() */
dstframe->width  = srcframe->width; /* must match sizes as on sws_getContext() */
dstframe->height = srcframe->height; /* must match sizes as on sws_getContext() */
int ret = av_frame_get_buffer(dstframe, 32);
if (ret < 0)
{
    fprintf(stderr, "Error: could not allocate the video frame data\n");
    exit(EXIT_FAILURE);
}

/* do the conversion */
ret = sws_scale(swsContext,             /* SwsContext* on step (1) */
                srcframe->data,         /* srcSlice[] from decoded AVFrame */
                srcframe->linesize,     /* srcStride[] from decoded AVFrame */
                0,                      /* srcSliceY   */
                src_height,             /* srcSliceH  from decoded AVFrame */
                dstframe->data,         /* dst[]       */
                dstframe->linesize);    /* dstStride[] */

if (ret < 0)
{
    /* error handling */
}

After a successful conversion you'll have what you want in dstframe.

Check out more formats and details on the functions and parameters here: https://www.ffmpeg.org/doxygen/trunk/index.html

Hope that helps.