
I have a video in yuv420p pixel format. At first I tried to read each frame's bytes through a pipe, with the pixel format set to rgb24, and used PIL to make an image of each frame. However, the frames read in rgb24 format seem to lose a little quality.

Here is the command for reading frames with the rgb24 pixel format:

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt rgb24 -an -r 25 -f rawvideo pipe:1
    frame_data = self.process.stdout.read(1920*1080*3)
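
For reference, the pipe is opened from Python roughly like this (a minimal sketch; the surrounding class that owns self.process is omitted and names are illustrative):

    import subprocess
    from PIL import Image

    process = subprocess.Popen(
        ['ffmpeg', '-y', '-i', 'input.mp4', '-vcodec', 'rawvideo',
         '-pix_fmt', 'rgb24', '-an', '-r', '25', '-f', 'rawvideo', 'pipe:1'],
        stdout=subprocess.PIPE)

    # Each 1920x1080 rgb24 frame is exactly 1920*1080*3 bytes
    frame_data = process.stdout.read(1920 * 1080 * 3)
    img = Image.frombytes('RGB', (1920, 1080), frame_data)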

Then I tried to read it with the yuv420p pixel format.

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt yuv420p -an -r 25 -f rawvideo pipe:1
    frame_data = self.process.stdout.read(1920*1080*3//2)  # integer division: yuv420p is 12 bits per pixel

A single yuv420p frame contains half the bytes of an rgb24 frame: 3110400 bytes for a 1920*1080 frame. I tossed this data into PIL:

    Image.frombytes('YCbCr', (1920, 1080), frame_data)

but PIL raised a "not enough image data" error. I looked up the modes that PIL supports for reading from bytes, and none of them is 12 bits per pixel. I also tried to transform the YUV data into RGB data, but that takes a lot more time when there is a long video to process.

Am I doing something wrong? Is there any way to write an image from raw YUV data without any transform?

I find your question very hard to understand. Are you trying to use ffmpeg or Python? What is the actual problem: reading something (if so, what?), or writing something (if so, what?), or losing quality, or losing speed? Can you share your input file? – Mark Setchell
I'm using ffmpeg and Python to extract frames, do some processing, and write the frames into a new file. – nathan wu
At first I extracted the frames by setting the pixel format to rgb24, so each read was 6220800 bytes. It worked fine but lost quality because the original video was yuv420p. Then I tried to extract frames without setting the pixel format, so each frame should be yuv420p and contain 3110400 bytes. What should I do with these 3110400 bytes of data? I can't use them through PIL... And transforming the data takes a lot of time. – nathan wu

1 Answer


Your YUV420p is chroma sub-sampled and "planar". The sub-sampling means that the U and V channels are each half the width and half the height of the full-resolution Y channel, so each is 1/4 of its full size. Because the format is planar, you will actually receive:

  • whole Y channel, followed by
  • 1/4 size U channel, followed by
  • 1/4 size V channel

which means, relative to an RGB image, you will have 1 whole and 2 quarter-size channels, i.e. 1.5 channels, half of what you would have with 3 full RGB channels... which is why it takes 12 bits per pixel rather than 24 bits.
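
As a quick sanity check, the byte counts from your question follow directly from that arithmetic:

    W, H = 1920, 1080
    print(W * H * 3)       # 6220800 bytes per rgb24 frame (24 bits/pixel)
    print(W * H * 3 // 2)  # 3110400 bytes per yuv420p frame (12 bits/pixel)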

PIL doesn't support sub-sampled chroma natively. So, in order to read your data, you could:

  • read a full-resolution Y channel into a PIL L mode image
  • read a h/2 x w/2 resolution U channel into a PIL L mode image, and resize to double
  • read a h/2 x w/2 resolution V channel into a PIL L mode image, and resize to double

Then merge those three single-channel images into one 3-channel YCbCr image, as in the sketch below.
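
A minimal sketch of that approach, assuming a 1920x1080 stream and the same ffmpeg command as in your question, might look like this:

    import subprocess
    from PIL import Image

    W, H = 1920, 1080
    FRAME_SIZE = W * H * 3 // 2   # yuv420p is 12 bits per pixel

    process = subprocess.Popen(
        ['ffmpeg', '-y', '-i', 'input.mp4', '-vcodec', 'rawvideo',
         '-pix_fmt', 'yuv420p', '-an', '-r', '25', '-f', 'rawvideo', 'pipe:1'],
        stdout=subprocess.PIPE)

    frame_data = process.stdout.read(FRAME_SIZE)

    # Planar layout: full-size Y plane, then quarter-size U and V planes
    y = Image.frombytes('L', (W, H), frame_data[:W * H])
    u = Image.frombytes('L', (W // 2, H // 2), frame_data[W * H:W * H * 5 // 4])
    v = Image.frombytes('L', (W // 2, H // 2), frame_data[W * H * 5 // 4:])

    # Upsample the chroma planes back to full resolution, then merge
    u = u.resize((W, H))
    v = v.resize((W, H))
    img = Image.merge('YCbCr', (y, u, v))
    img.convert('RGB').save('frame.png')

Note that resize() here is doing the chroma upsampling a proper YUV-to-RGB converter would do, so the result is an approximation rather than a bit-exact conversion.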

It is unclear to me why you are using PIL at all, though. If you just want to write an un-processed, raw YUV420p stream to disk, let ffmpeg do it itself.
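
For example, something along these lines writes the decoded frames straight to disk as raw YUV420p, with no Python in the loop (output.yuv is an illustrative name):

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt yuv420p -an -f rawvideo output.yuv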