I want to know, once and for all, how time base calculation and rescaling work in FFmpeg. Before asking this question I did some research and found many contradictory answers, which only made it more confusing. Based on the official FFmpeg examples, one has to
rescale output packet timestamp values from codec to stream timebase
with something like this:
pkt->pts = av_rescale_q_rnd(pkt->pts, *time_base, st->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);
pkt->dts = av_rescale_q_rnd(pkt->dts, *time_base, st->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);
pkt->duration = av_rescale_q(pkt->duration, *time_base, st->time_base);
But in this question someone was asking a similar question to mine and gave more examples, each of them doing it differently. And contrary to the answer, which says that all of those ways are fine, for me only the following approach works:
frame->pts += av_rescale_q(1, video_st->codec->time_base, video_st->time_base);
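As far as I understand it, av_rescale_q(a, bq, cq) simply converts a count of a ticks in time base bq into the equivalent number of ticks in time base cq (roughly a * bq / cq with rounding), and av_rescale_q_rnd does the same with an explicit rounding mode. A tiny standalone check of that understanding (the time bases here are picked just for illustration, they are not from my application):

extern "C" {
#include <libavutil/avutil.h>
#include <libavutil/mathematics.h>
}
#include <cstdio>

int main() {
    AVRational frame_tb = {1, 60};   // one tick = one frame at 60 fps
    AVRational ms_tb    = {1, 1000}; // one tick = one millisecond

    // 90 frames = 1.5 s, so this should print 1500
    printf("%lld\n", (long long)av_rescale_q(90, frame_tb, ms_tb));

    // Same conversion with the rounding flags from the official example;
    // AV_ROUND_PASS_MINMAX lets AV_NOPTS_VALUE (INT64_MIN) pass through unchanged.
    printf("%lld\n", (long long)av_rescale_q_rnd(
        AV_NOPTS_VALUE, frame_tb, ms_tb,
        static_cast<AVRounding>(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX)));
    return 0;
}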
In my application I am generating video packets (H.264) at 60 fps outside the FFmpeg API and then writing them into an MP4 container.
I explicitly set:
video_st->time_base = {1,60};
video_st->r_frame_rate = {60,1};
video_st->codec->time_base = {1,60};
The first weird thing I see happens right after I have written the header for the output format context:
AVDictionary *opts = nullptr;
int ret = avformat_write_header(mOutputFormatContext, &opts);
av_dict_free(&opts);
After that, video_st->time_base is populated with:
num = 1;
den = 15360
And I fail to understand why; I would really like someone to explain this to me.
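For what it's worth, if I apply that rescaling to these numbers, one frame in my 1/60 codec time base should correspond to 256 ticks of the 1/15360 stream time base (a quick standalone check using the values above):

extern "C" {
#include <libavutil/mathematics.h>
}
#include <cstdio>

int main() {
    AVRational codec_tb  = {1, 60};     // what I set before writing the header
    AVRational stream_tb = {1, 15360};  // what I find in video_st->time_base afterwards
    // 15360 / 60 = 256, so one frame should become 256 stream ticks
    printf("%lld\n", (long long)av_rescale_q(1, codec_tb, stream_tb));
    return 0;
}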
Next, before writing a frame I calculate the PTS for the packet. In my case PTS = DTS, as I don't use B-frames at all.
And I have to do this:
// One frame (one tick of the codec time base) expressed in stream time base ticks.
const int64_t duration = av_rescale_q(1, video_st->codec->time_base, video_st->time_base);
totalPTS += duration; // totalPTS is a global variable
packet->pts = totalPTS;
packet->dts = totalPTS; // PTS == DTS, no B-frames
av_write_frame(mOutputFormatContext, packet);
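As an aside, if I read the headers correctly, newer FFmpeg versions also provide av_packet_rescale_ts(), which rescales pts, dts and duration in a single call. A sketch of what I believe would be the equivalent of the code above (the write_packet helper and its parameters are my own illustration, not part of my actual code):

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}

// Sketch: convert a packet's timestamps from the codec time base to whatever
// the muxer left in st->time_base, then write it. av_packet_rescale_ts()
// adjusts pts, dts and duration together and passes AV_NOPTS_VALUE through.
static int write_packet(AVFormatContext *oc, AVStream *st,
                        AVRational codec_tb, AVPacket *pkt)
{
    av_packet_rescale_ts(pkt, codec_tb, st->time_base);
    pkt->stream_index = st->index;
    return av_write_frame(oc, pkt);
}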
I don't get why the codec and the stream have different time_base values even though I explicitly set them to be the same. And since I see across all the examples that av_rescale_q is always used to calculate the duration, I really want someone to explain this point.
Additionally, as a comparison and for the sake of experiment, I decided to try writing the stream into a WebM container instead. In that case I don't use the libav output stream at all; I just grab the same packet I use for the MP4 encoding and write it manually into an EBML stream. There I calculate the duration like this:
// Multiply before dividing so the integer division doesn't truncate to zero.
const int64_t duration =
    (1000LL * video_st->codec->time_base.num) / video_st->codec->time_base.den;
The multiplication by 1000 is required for WebM, as timestamps in that container are expressed in milliseconds. And this works. So why, in the case of MP4 stream encoding, is there a difference in time_base that has to be rescaled?
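For completeness, here is the same conversion to milliseconds done with av_rescale_q against a {1, 1000} time base, which is how I would expect the generic rescaling to express it (a standalone sketch; the rounding differs slightly from my plain integer division):

extern "C" {
#include <libavutil/mathematics.h>
}
#include <cstdio>

int main() {
    AVRational codec_tb = {1, 60};   // my codec time base
    AVRational webm_tb  = {1, 1000}; // WebM/Matroska timestamps are in milliseconds

    // One frame at 60 fps: 1000/60 = 16.67 ms, rounded to 17 by av_rescale_q's
    // default nearest rounding.
    printf("one frame = %lld ms\n", (long long)av_rescale_q(1, codec_tb, webm_tb));

    // Timestamp of frame 600 (10 seconds in), computed from the frame index
    // rather than by accumulating a rounded per-frame duration.
    printf("frame 600 = %lld ms\n", (long long)av_rescale_q(600, codec_tb, webm_tb));
    return 0;
}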