0
votes

I am trying to extract frames out of mp4 videos in order to process them.

Specifically, there is a watermark / timestamp within the video image which I want to use to automatically stitch the videos together. The video creation date is not sufficient for this task.

Extracting the text from the frames with AI already works fine.

However, FFMPEG seems terribly slow. The source video is 1080p / 60fps (roughly 1 GB per 5 minutes of video).

I have tried two methods so far, both using the Accord.FFMPEG wrapper:

public void GetVideoFrames(string path)
{
    using (var vFReader = new VideoFileReader())
    {
        // open video file
        vFReader.Open(path);
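        // counter is used to keep only every 60th frame (1 frame per second at 60 fps)
        int counter = 0;
        for (int i = 0; i < vFReader.FrameCount; i++)
        {
            counter++;
            if (counter < 60)
            {
                // decode and immediately discard the frames in between
                vFReader.ReadVideoFrame()?.Dispose();
                continue;
            }
            else
            {
                // reset so that the next 59 frames are skipped again
                counter = 0;
                using (Bitmap frame = vFReader.ReadVideoFrame())
                {
                    // Process Bitmap
                }
            }
        }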
        }
    }
}

The other attempt:

for (int i = 0; i < vFReader.FrameCount; i += 60)
{
    // notice here, I am specifying which exact frame to extract
    Bitmap frame = vFReader.ReadVideoFrame(i);
    // process frame
}

The second method is what I tried first, and it is totally unfeasible. Apparently FFMPEG performs a new seek for each requested frame, so the operation takes longer and longer with each frame processed. After only 5 frames, it already takes roughly 4 seconds to produce one frame.

The first method at least does not seem to suffer from that issue as heavily, but it still takes roughly 2 seconds to yield a frame. At this rate I would be faster processing the video manually.

Is there anything wrong with my approach? I would also rather not have a solution where I need to separately install third-party libraries on the target machine. So, if there are any alternatives, I'd be happy to try them out, but it seems literally everyone on Stack Overflow is pointing to either ffmpeg or OpenCV.

1
Aren't you pulling every single frame as an image? Why do you expect it to be fast? When you say stitch videos together, what exactly do you mean? - Llama
No. Method 1 (faster) is indeed pulling every single frame and dumping most of them. Of course it's very slow. That's why I used method 2 first, expecting that ffmpeg would seek from the last known position in the stream. But apparently it seeks from position 0 every single time. So method 2 should not pull every single frame, yet it's much slower. - julian bechtold
OK so you'll render them side-by-side or something after syncing them up? - Llama
A couple of years ago, I implemented such functionality using DirectShow (custom filters written in C++). It was very fast. I think you should switch to native if you need performance. - dymanoid
You can use DirectShow or Media Foundation for that. These are Windows' native technologies. You don't need to bother with decoding video formats and stuff. This will be done by Windows for you. You can access the video frames data, seek through your videos, encode the videos back etc. - dymanoid

1 Answer

0
votes

I think the problem isn't with FFmpeg itself, but with how the Accord wrapper does the seek. I'd recommend using ffmpeg directly in a single pass to extract the frames, as it has options to extract only keyframes or every Nth frame (or you can just use the embedded video timestamps...). But if you want to continue with your current path, then maybe consider passing the desired frame index rather than reading every frame in a for loop; it should be faster, and maybe you can parallelize it.

But it'll be much faster to do that all in a separate ffmpeg process.
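
As a rough illustration of the separate-process idea, here is a minimal sketch that launches ffmpeg once and lets its `fps` filter write one frame per second as numbered PNG files. It assumes an ffmpeg executable is available at a path you supply (for example a static build shipped next to your application); the class name `FrameExtractor`, its methods, and the `frame_%06d.png` naming pattern are made up for this example and are not part of Accord or ffmpeg.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class FrameExtractor
{
    // Builds the ffmpeg argument string (hypothetical helper).
    // "-vf fps=N" makes ffmpeg emit N frames per second of video in a single pass;
    // "%06d" in the output name is expanded by ffmpeg to a running frame number.
    public static string BuildArguments(string videoPath, string outputDir, int framesPerSecond = 1)
    {
        string pattern = Path.Combine(outputDir, "frame_%06d.png");
        return $"-hide_banner -loglevel error -i \"{videoPath}\" -vf fps={framesPerSecond} \"{pattern}\"";
    }

    // Runs ffmpeg as a child process and blocks until it has finished.
    // ffmpegPath is an assumption: point it at wherever the executable lives.
    public static void ExtractFrames(string ffmpegPath, string videoPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);

        var startInfo = new ProcessStartInfo
        {
            FileName = ffmpegPath,
            Arguments = BuildArguments(videoPath, outputDir),
            UseShellExecute = false,      // required for stream redirection
            RedirectStandardError = true, // ffmpeg writes its messages to stderr
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read stderr before waiting so a full pipe cannot block ffmpeg.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
                throw new InvalidOperationException("ffmpeg failed: " + errors);
        }
    }
}
```

A call such as `FrameExtractor.ExtractFrames("ffmpeg.exe", path, "frames")` would then leave the extracted images in the `frames` folder, where they can be loaded as `Bitmap`s for the text recognition step. Since only the timestamp region matters, adding a crop to the filter chain might reduce the work further, but that is untested here.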