How to map frame extracted with ffmpeg and subtitle of a video? (frame accuracy problem)

Question

I would like to generate text files for frames extracted with ffmpeg, each containing the subtitle shown in that frame (if any), for a video into which I have burned the subtitles, also with ffmpeg.

I use a Python script with pysrt to open the SubRip file and generate the text files. ffmpeg names each frame with its frame number, and since the frames are extracted at a constant rate, I can retrieve the time position of a frame with the formula t1 = fnum/fps, where fnum is the frame number taken from the filename and fps is the rate passed to ffmpeg for the frame extraction.

Even though I use the same subtitle file to look up the text positions in the timeline as the one that was burned into the video, I still get accuracy errors. Mostly, some text files are missing and some are present that shouldn't be.

Because time is not really continuous when talking about frames, I have tried recalibrating t using the fps of the video with the hardcoded subtitles; let's call that vfps, for video fps (I have checked that the video fps is the same before and after subtitle burning). This gives the formula t2 = int(t1*vfps)/vfps. It is still not 100% accurate.

For example, my video is at 30fps (vfps=30) and I extracted frames at 4fps (fps=4). The extracted frame 166 (fnum=166) shows no subtitle. In the SubRip file, the previous subtitle ends at t_prev=41.330 and the next subtitle begins at t_next=41.400, which means that the time t_sub computed for this frame should satisfy t_prev < t_sub < t_next, but none of my formulas produces such a value.

Formulas I have tried:

t1 = fnum/fps  # 41.5 > t_next
t2 = int(fnum*vfps/fps)/vfps  # 41.5 > t_next
# is it because of an indexing problem? No:
t3 = (fnum-1)/fps  # 41.25 < t_prev
t4 = int((fnum-1)*vfps/fps)/vfps  # 41.23333333 < t_prev
t5 = int(fnum*vfps/fps - 1)/vfps  # 41.466666 > t_next
t6 = int((fnum-1)*vfps/fps + 1)/vfps  # 41.26666 < t_prev
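The six attempts above can be re-checked with exact rational arithmetic, which rules out floating-point rounding as the cause. This is only a sketch: the helper names `candidate_times` and `times_in_gap` are made up for illustration, and the values are the ones from the example (fnum=166, fps=4, vfps=30).

```python
from fractions import Fraction


def candidate_times(fnum, fps, vfps):
    """Return the six candidate timestamps t1..t6 as exact fractions."""
    fps, vfps = Fraction(fps), Fraction(vfps)
    return {
        "t1": fnum / fps,
        "t2": int(fnum * vfps / fps) / vfps,
        "t3": (fnum - 1) / fps,
        "t4": int((fnum - 1) * vfps / fps) / vfps,
        "t5": int(fnum * vfps / fps - 1) / vfps,
        "t6": int((fnum - 1) * vfps / fps + 1) / vfps,
    }


def times_in_gap(times, t_prev, t_next):
    """Names of the candidates that fall strictly between two subtitles.

    t_prev and t_next are decimal strings so they are parsed exactly.
    """
    lo, hi = Fraction(t_prev), Fraction(t_next)
    return [name for name, t in times.items() if lo < t < hi]


times = candidate_times(166, 4, 30)
for name, t in times.items():
    print(name, float(t))
print("in gap:", times_in_gap(times, "41.330", "41.400"))
```

For the example values no candidate lands inside the gap, so the mismatch is not a float rounding artifact of these formulas.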

Command used:

# burning subtitles
# (previously)
# ffmpeg -r 25 -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4
# now:
ffmpeg -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4
# frames extraction
ffmpeg -i withsub.mp4 -vf fps=4 extracted/%05d.bmp -hide_banner

Why does this happen and how can I solve this?

One thing I have noticed is that if I extract frames from both the original video and the subtitled one and take the difference of the frames, the result is not only the subtitles: there are also variations in the background (which shouldn't happen). If I run the same experiment using the same video twice, the difference is null, which means that the frame extraction is consistent.

Code for the difference:

ffmpeg -i withsub.mp4 -vf fps=4 extracted/%05d.bmp -hide_banner
ffmpeg -i nosub.mp4 -vf fps=4 extracted_no_sub/%05d.bmp -hide_banner
mkdir -p diff
for img in extracted_no_sub/*.bmp; do
    convert "extracted/${img##*/}" "$img" -compose minus -composite "diff/${img##*/}"
done

Thanks.

ffmpeg -r 25 -i nosub.mp4 --> this will retime frames and destroy original timestamps. You don't want to do this unless you know what you're doing. — Gyan
Not really... I wanted to burn the subtitles without changing the framerate. So I tried this first: ffmpeg -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4 but it changed the rate from 25 to 30, so I manually set the frame rate, and it seems that this is not the right way to do it. I am not a big user of ffmpeg. I googled how to preserve the framerate with ffmpeg and found this superuser.com/questions/460332/… but it has no accepted answer. — Nick Skywalker
My bad, the original fps was 30, so my first attempt to burn the subtitles did not change the frame rate. The problem remains, however. And doing a frame subtraction before and after subtitle burning shows more differences than just the subtitles. Let me point out that this is not the problem I am trying to solve, but it may be the cause of my problem. What log would you like to see? — Nick Skywalker
It's also worth noting that, in the example, the boundaries of the interval multiplied by the frame extraction rate give t_prev*fps=165.32 and t_next*fps=165.6. This implies that, if the frames extracted by ffmpeg fall on multiples of 1/fps, frame 166 should not land between the two subtitles but should display the second one instead (or maybe the previous one). The same holds when correcting by the video fps: int(t_prev*vfps)*fps/vfps=165.2, int(t_next*vfps)*fps/vfps=165.6 — Nick Skywalker

Gyan · Accepted Answer · 2019-11-14T16:12:53

You can extract frames with accurate timestamps, like this:

ffmpeg -i nosub.mp4 -vf subtitles=sub.srt,settb=AVTB,select='if(eq(n\,0)\,1\,floor(4*t)-floor(4*prev_t))' -vsync 0 -r 1000 -frame_pts true extracted/%08d.bmp

This will extract the first frame from each quarter second. The output filename is 8 digits long, where the first 5 digits are seconds and the last 3 are milliseconds. You can change the field size based on the maximum file duration.
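With filenames that carry the timestamp, the frame-to-subtitle lookup no longer needs any fps arithmetic. The following is a minimal sketch, assuming the filename stem is the presentation time in milliseconds as described above, and that subtitle intervals are available as (start_ms, end_ms, text) tuples built from the SubRip file, for example with pysrt. The function names, the sample intervals (apart from the 41.330 and 41.400 boundaries from the question) and the half-open interval convention are assumptions made for illustration.

```python
from pathlib import Path


def time_ms_from_filename(path):
    """Interpret the numeric file stem as a timestamp in milliseconds."""
    return int(Path(path).stem)


def subtitle_at(t_ms, subs):
    """Return the text of the subtitle active at t_ms, or None.

    subs is a list of (start_ms, end_ms, text) tuples. The interval is
    treated as half-open [start, end), which is an assumption, not
    something ffmpeg or the SubRip format guarantees.
    """
    for start_ms, end_ms, text in subs:
        if start_ms <= t_ms < end_ms:
            return text
    return None


# Hypothetical intervals: only the 41.330 s end and the 41.400 s start
# come from the question; the other bounds and the texts are invented.
subs = [
    (40000, 41330, "previous line"),
    (41400, 43000, "next line"),
]

for name in ["extracted/00041250.bmp", "extracted/00041366.bmp"]:
    print(name, subtitle_at(time_ms_from_filename(name), subs))
```

Working in integer milliseconds keeps the comparison exact, so a frame is assigned to a subtitle only when its timestamp really falls inside that subtitle's interval.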
