I would like to generate text files for frames extracted with ffmpeg, each containing the subtitle shown on that frame (if any), from a video into which I have burned the subtitles, also using ffmpeg.
I use a Python script with pysrt to open the SubRip file and generate the text files.
ffmpeg names each frame with its frame number, and since the frames are extracted at a constant rate, I can retrieve the time position of a frame with the formula t1 = fnum/fps, where fnum is the frame number taken from the filename and fps is the rate passed to ffmpeg for the frame extraction.
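A minimal sketch of this lookup, for illustration only. The helper names `frame_time` and `subtitle_at` and the cue list are hypothetical; only the 41.330 and 41.400 boundaries come from the example further down. In the real script the cue boundaries would come from the file parsed with pysrt rather than from a hard-coded list.

```python
from fractions import Fraction

def frame_time(fnum, fps):
    """Time in seconds assigned to extracted frame `fnum`, using t1 = fnum/fps."""
    return Fraction(fnum) / Fraction(fps)

def subtitle_at(t, cues):
    """Return the text of the first cue whose [start, end] interval contains t, else None."""
    for start, end, text in cues:
        if start <= t <= end:
            return text
    return None

# Hypothetical cues as (start seconds, end seconds, text).
cues = [
    (Fraction("40.000"), Fraction("41.330"), "previous subtitle"),
    (Fraction("41.400"), Fraction("43.000"), "next subtitle"),
]

t1 = frame_time(166, 4)
print(float(t1), subtitle_at(t1, cues))
```

Exact fractions are used here so that comparisons against cue boundaries are not affected by floating-point rounding.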
Even though I look up the text positions in the same subtitle file that was burned into the video, I still get accuracy errors: some text files are missing, and some are present that shouldn't be.
Because time is not really continuous when dealing with frames, I tried recalibrating t using the fps of the video with the hardcoded subtitles; let's call it vfps, for video fps (I have ensured that the video fps is the same before and after subtitle burning). This gives the formula t2 = int(t1*vfps)/vfps.
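The recalibration can be sketched as follows. The function name `snap_to_video_grid` is hypothetical, and the sketch assumes vfps is an integer. It only restates t2 = int(t1*vfps)/vfps with exact arithmetic and makes no claim about how ffmpeg itself picks frames.

```python
from fractions import Fraction
from math import floor

def snap_to_video_grid(t, vfps):
    """Floor t to the start of the video frame that contains it: int(t*vfps)/vfps."""
    return Fraction(floor(t * vfps), vfps)

t1 = Fraction(166, 4)             # extracted frame 166 at 4 fps
t2 = snap_to_video_grid(t1, 30)   # snapped to the 30 fps grid
print(float(t1), float(t2))
```

Because 41.5 s already falls exactly on a 30 fps frame boundary (frame 1245), snapping leaves it unchanged in this case.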
It is still not 100% accurate.
For example, my video is at 30 fps (vfps=30) and I extracted frames at 4 fps (fps=4).
The extracted frame 166 (fnum=166) shows no subtitle. In the SubRip file, the previous subtitle ends at t_prev=41.330 and the next subtitle begins at t_next=41.400, which means that the frame's time t_sub should satisfy t_prev < t_sub and t_sub < t_next, but I can't make this happen.
Formulas I have tried:
t1 = fnum/fps # 41.5 > t_next
t2 = int(fnum*vfps/fps)/vfps # 41.5 > t_next
# is it because of an indexing problem? No:
t3 = (fnum-1)/fps # 41.25 < t_prev
t4 = int((fnum-1)*vfps/fps)/vfps # 41.23333333 < t_prev
t5 = int(fnum*vfps/fps - 1)/vfps # 41.466666 > t_next
t6 = int((fnum-1)*vfps/fps + 1)/vfps # 41.26666 < t_prev
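For reference, a short script that re-evaluates the six candidates above with exact fractions and checks each against the gap between the two subtitles. The helper `in_gap` and the dictionary layout are illustrative choices, not part of the original script.

```python
from fractions import Fraction
from math import floor

fnum, fps, vfps = 166, 4, 30
t_prev, t_next = Fraction("41.330"), Fraction("41.400")

# The six formulas tried, with int() written as floor() on exact fractions.
candidates = {
    "t1": Fraction(fnum, fps),
    "t2": Fraction(floor(Fraction(fnum * vfps, fps)), vfps),
    "t3": Fraction(fnum - 1, fps),
    "t4": Fraction(floor(Fraction((fnum - 1) * vfps, fps)), vfps),
    "t5": Fraction(floor(Fraction(fnum * vfps, fps) - 1), vfps),
    "t6": Fraction(floor(Fraction((fnum - 1) * vfps, fps) + 1), vfps),
}

def in_gap(t):
    """True if t lies strictly between the end of one cue and the start of the next."""
    return t_prev < t < t_next

for name, t in candidates.items():
    print(f"{name} = {float(t):.4f}  in gap: {in_gap(t)}")

in_gap_names = [name for name, t in candidates.items() if in_gap(t)]
```

None of the six values lands strictly between t_prev and t_next, which matches the comments next to each formula.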
Command used:
# burning subtitles
# (previously)
# ffmpeg -r 25 -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4
# now:
ffmpeg -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4
# frames extraction
ffmpeg -i withsub.mp4 -vf fps=4 extracted/%05d.bmp -hide_banner
Why does this happen and how can I solve this?
One thing I have noticed is that if I extract frames from both the original video and the subtitled one and take the difference of the frames, the result is not only the subtitles: there are also variations in the background (which shouldn't happen). If I run the same experiment on the same video twice, the difference is null, which means that the frame extraction is consistent.
Code for the difference:
ffmpeg -i withsub.mp4 -vf fps=4 extracted/%05d.bmp -hide_banner
ffmpeg -i nosub.mp4 -vf fps=4 extracted_no_sub/%05d.bmp -hide_banner
# compare each frame without subtitles against the frame of the same name with subtitles
for img in extracted_no_sub/*.bmp; do
    convert "extracted/${img##*/}" "$img" -compose minus -composite "diff/${img##*/}"
done
Thanks.
ffmpeg -i nosub.mp4 -vf subtitles=sub.srt withsub.mp4 but it changed the rate from 25 to 30, so I manually set the frame rate, and it seems that is not the right way to do it. I am not a frequent ffmpeg user. I searched for how to preserve the frame rate with ffmpeg and found superuser.com/questions/460332/… but it has no accepted answer. – Nick Skywalker
t_prev*fps = 165.32 and t_next*fps = 165.6, which implies that, if the frames extracted by ffmpeg fall on multiples of 1/fps, frame 166 should not land between the two subtitles but should display the second one instead (or possibly the previous one). The same holds when correcting by the video fps: int(t_prev*vfps)*fps/vfps = 165.2, int(t_next*vfps)*fps/vfps = 165.6. – Nick Skywalker
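A small check of the arithmetic in the comment above, under the assumption (not established in the question) that source frame k of the subtitled video is shown at time k/vfps. It lists which 30 fps source frames fall strictly inside the gap and where they sit in units of 1/fps.

```python
from fractions import Fraction
from math import ceil, floor

fps, vfps = 4, 30
t_prev, t_next = Fraction("41.330"), Fraction("41.400")

# Source-frame indices k with t_prev < k/vfps < t_next (strict on both sides).
first = floor(t_prev * vfps) + 1
last = ceil(t_next * vfps) - 1
gap_frames = list(range(first, last + 1))

for k in gap_frames:
    t = Fraction(k, vfps)
    print(f"source frame {k}: t = {float(t):.4f} s, t*fps = {float(t * fps):.4f}")
```

Under that assumption, the source frames inside the gap sit at non-integer positions on the 1/fps grid, so a time of the form n/fps cannot match them exactly.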