7
votes

I'm using ffmpeg libraries to decode, scale, and re-encode video within an MPEG transport stream. I've just recompiled from source to v3.3.2 and changed from the old avcodec_decode_video2() API to the new send/receive API.

Both the old and new APIs decode the video very slowly.

25 fps video = 1 frame every 40ms. However, I see 70 to 120ms per frame to decode. This is a file translator so need it to run faster than real time.

The code outline is below. Anyone have any ideas on how to improve the decoding speed? There are other posts about the deprecated avcodec_decode_video2() being slow; none of those were resolved. The new API doesn't run any faster...

gettimeofday(&tv1, NULL);
int rc = av_read_frame(pFormatContext, pESPacket);
gettimeofday(&tv2, NULL);

int ret = avcodec_send_packet(pDecoderContext, pESPacket);
if (ret < 0)
    continue;

ret = avcodec_receive_frame(pDecoderContext, pFrameDec);
if (ret != 0)
{
    printf("avcodec_receive_frame error: %d\n", ret);
    continue;
}
gettimeofday(&tv3, 0);

u_long twoMinusOne   = (tv2.tv_sec - tv1.tv_sec) * 1000000 + tv2.tv_usec - tv1.tv_usec;
u_long threeMinusTwo = (tv3.tv_sec - tv2.tv_sec) * 1000000 + tv3.tv_usec - tv2.tv_usec;

size_t pktSize = mPacketQueue.getTsPktListSize();
printf("  DECODE ReadFrame %lu usec, DecodeVideo %lu usec. mTsPacketList %u items\n", twoMinusOne, threeMinusTwo, pktSize);

transcodeFrame(pFrameDec);

// Scale and re-encode //
-- call avscale to downsample
-- call avcodec_encode_video2() to encode

Some Output

DECODE ReadFrame 6 usec, DecodeVideo 154273 usec.
Dump mpFrameEnc with DateTime: 
  AVFrame Info frame 720 X 406. PTS = 305700353  PKT_PTS = 305700353 Linesize[0]=720. Linesize[1]=360. Linesize[2]=360.   
Time taken to ENCODE video frame = 3685 usec. Scaling time 4 usec

DECODE ReadFrame 8 usec, DecodeVideo 128203 usec.
Time taken to ENCODE video frame = 3724 usec. Scaling time 3 usec

DECODE ReadFrame 8 usec, DecodeVideo 69321 usec.
Time taken to ENCODE video frame = 3577 usec. Scaling time 3 usec

FFMPEG Version

Tests running on core2 duo 3.2 GHz, 32-bit Centos 6.

bin/ffmpeg
ffmpeg version 3.3.2 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-11)
  configuration: --prefix=/mnt/swdevel/DVStor/source_build/ext/ffmpeg-build --libdir=/mnt/swdevel/DVStor/source_build/ext/ffmpeg-build/lib3p_build --shlibdir=/mnt/swdevel/DVStor/source_build/ext/ffmpeg-build/lib3p_build --disable-static --enable-shared --disable-cuda --disable-cuvid --disable-nvenc --enable-libx264 --enable-gpl --extra-cflags=-I/usr/local/include/libx264
  libavutil      55. 58.100 / 55. 58.100
  libavcodec     57. 89.100 / 57. 89.100
  libavformat    57. 71.100 / 57. 71.100
  libavdevice    57.  6.100 / 57.  6.100
  libavfilter     6. 82.100 /  6. 82.100
  libswscale      4.  6.100 /  4.  6.100
  libswresample   2.  7.100 /  2.  7.100
  libpostproc    54.  5.100 / 54.  5.100
Hyper fast Audio and Video encoder
2
Are you grabbing packets, decoding frames, scaling and encoding all on the same thread?WLGfx
One thread populates a queue with TS packets, that isn't shown in the code. The second thread, using AVIO to feed read_frame, reads frames, decodes, and then encodes. But the real issue is that, when fed a frame, the decoder takes a very long time.Danny
Hello Danny! Right now I'm facing the same problem with decoding slowness of ffmpeg. I'm curious did you find the solution for that problem? If so, please let me know. ThanksEugene Alexeev

2 Answers

1
votes

A possible solution for future reference would be the following:

Enable multithreading for the decoder. Per default the decoder only uses one thread, depending on the decoder, multithreading can speed up decoding drastically.

Assuming you have AVFormatContext *format_ctx, a matching codec AVCodec* codec and AVCodecContext* codec_ctx (allocated using avcodec_alloc_context3).

Before opening the codec context (using avcodec_open2) you can configure multithreading. Check the capabilites of the codec in order to decide which kind of multithreading you can use:

// set codec to automatically determine how many threads suits best for the decoding job
codec_ctx->thread_count = 0;

if (codec->capabilities | AV_CODEC_CAP_FRAME_THREADS)
   codec_ctx->thread_type = FF_THREAD_FRAME;
else if (codec->capabilities | AV_CODEC_CAP_SLICE_THREADS)
   codec_ctx->thread_type = FF_THREAD_SLICE;
else
   codec_ctx->thread_count = 1; //don't use multithreading
0
votes

In case you are still looking for help.

Depending on the pixel format of the video, VLC could be using the GPU to decode it. To be sure you use GPU-Z(only works in windows, there are similar tools for Linux) to measure the Video load Engine. If that's the case, you might need to check if your libs support GPU encoding/decoding.