
I have a video in yuv420p pixel format. At first I tried to read each frame's bytes through a pipe, with the pixel format set to rgb24, and used PIL to make an image of each frame. However, the frames read in rgb24 format seem to lose a little quality.

Here is the command for reading frames with the rgb24 pixel format:

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt rgb24 -an -r 25 -f rawvideo pipe:1
    frame_data = self.process.stdout.read(1920*1080*3)
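
For reference, the pipe is opened from Python roughly like this (a minimal sketch; the surrounding class that owns self.process is omitted and names are illustrative):

    import subprocess
    from PIL import Image

    process = subprocess.Popen(
        ['ffmpeg', '-y', '-i', 'input.mp4', '-vcodec', 'rawvideo',
         '-pix_fmt', 'rgb24', '-an', '-r', '25', '-f', 'rawvideo', 'pipe:1'],
        stdout=subprocess.PIPE)

    # Each 1920x1080 rgb24 frame is exactly 1920*1080*3 bytes
    frame_data = process.stdout.read(1920 * 1080 * 3)
    img = Image.frombytes('RGB', (1920, 1080), frame_data)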

Then I tried to read it with the yuv420p pixel format.

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt yuv420p -an -r 25 -f rawvideo pipe:1
    frame_data = self.process.stdout.read(1920*1080*3//2)  # integer division: yuv420p is 12 bits per pixel

A single yuv420p frame contains half the bytes of an rgb24 frame: 3110400 bytes for a 1920*1080 frame. I tossed this data into PIL:

    Image.frombytes('YCbCr', (1920, 1080), frame_data)

but PIL raised a "not enough image data" error. I looked up the modes that PIL supports for reading from bytes, and none of them is 12 bits per pixel. I also tried to transform the YUV data into RGB data, but that takes a lot more time when there is a long video to process.

Am I doing something wrong? Is there any way to write an image from raw YUV data without any transform?

I find your question very hard to understand. Are you trying to use ffmpeg or Python? What is the actual problem: reading something (if so, what?), or writing something (if so, what?), or losing quality, or losing speed? Can you share your input file? – Mark Setchell
I'm using ffmpeg and Python to extract frames, do some processing, and write the frames into a new file. – nathan wu
At first I extracted the frames by setting the pixel format to rgb24, so each read was 6220800 bytes. It worked fine but lost quality because the original video was yuv420p. Then I tried to extract frames without setting the pixel format, so each frame should be yuv420p and contain 3110400 bytes. What should I do with these 3110400 bytes of data? I can't use them through PIL... And transforming the data takes a lot of time. – nathan wu

1 Answer


Your YUV420p is chroma sub-sampled and "planar". The sub-sampling means that the U and V channels are each half the width and half the height of the full-resolution Y channel, so each is 1/4 of its full size. Because the format is planar, you will actually receive:

  • whole Y channel, followed by
  • 1/4 size U channel, followed by
  • 1/4 size V channel

which means, relative to an RGB image, you will have 1 whole and 2 quarter-size channels, i.e. 1.5 channels, half of what you would have with 3 full RGB channels... which is why it takes 12 bits per pixel rather than 24 bits.
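
As a quick sanity check, the byte counts from your question follow directly from that arithmetic:

    W, H = 1920, 1080
    print(W * H * 3)       # 6220800 bytes per rgb24 frame (24 bits/pixel)
    print(W * H * 3 // 2)  # 3110400 bytes per yuv420p frame (12 bits/pixel)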

PIL doesn't support sub-sampled chroma natively. So, in order to read your data, you could:

  • read a full-resolution Y channel into a PIL L mode image
  • read a h/2 x w/2 resolution U channel into a PIL L mode image, and resize to double
  • read a h/2 x w/2 resolution V channel into a PIL L mode image, and resize to double

Then merge those three single-channel images into one 3-channel YCbCr image, as in the sketch below.
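
A minimal sketch of that approach, assuming a 1920x1080 stream and the same ffmpeg command as in your question, might look like this:

    import subprocess
    from PIL import Image

    W, H = 1920, 1080
    FRAME_SIZE = W * H * 3 // 2   # yuv420p is 12 bits per pixel

    process = subprocess.Popen(
        ['ffmpeg', '-y', '-i', 'input.mp4', '-vcodec', 'rawvideo',
         '-pix_fmt', 'yuv420p', '-an', '-r', '25', '-f', 'rawvideo', 'pipe:1'],
        stdout=subprocess.PIPE)

    frame_data = process.stdout.read(FRAME_SIZE)

    # Planar layout: full-size Y plane, then quarter-size U and V planes
    y = Image.frombytes('L', (W, H), frame_data[:W * H])
    u = Image.frombytes('L', (W // 2, H // 2), frame_data[W * H:W * H * 5 // 4])
    v = Image.frombytes('L', (W // 2, H // 2), frame_data[W * H * 5 // 4:])

    # Upsample the chroma planes back to full resolution, then merge
    u = u.resize((W, H))
    v = v.resize((W, H))
    img = Image.merge('YCbCr', (y, u, v))
    img.convert('RGB').save('frame.png')

Note that resize() here is doing the chroma upsampling a proper YUV-to-RGB converter would do, so the result is an approximation rather than a bit-exact conversion.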

It is unclear to me why you are using PIL at all, though. If you just want to write an un-processed, raw YUV420p stream to disk, let ffmpeg do it itself.
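
For example, something along these lines writes the decoded frames straight to disk as raw YUV420p, with no Python in the loop (output.yuv is an illustrative name):

    ffmpeg -y -i input.mp4 -vcodec rawvideo -pix_fmt yuv420p -an -f rawvideo output.yuv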