1
votes

I have a 512x512 image and I tried to recompress it. Here are the steps for recompressing an image to a JPEG file:

    1) convert RGB to YCrCb
    2) perform downsampling on Cr and Cb
    3) apply the DCT to each YCrCb component and quantize according to the chosen quality
    4) perform Huffman encoding on the quantized DCT coefficients

But before Huffman encoding I counted the number of DCT coefficients, and it is 393216. Dividing it by 64 gives the number of 8x8 DCT blocks, which is 6144.

Now I tried to count the number of 8x8 blocks in the pixel domain. 512/8 = 64, which gives me 64 blocks horizontally and 64 blocks vertically. 64 x 64 = 4096, which is not equal to the number of DCT blocks, while the number of pixels is 512x512 = 262144.

My question is: how does Huffman encoding magically transform 393216 coefficients into 262144 pixels, recover each pixel value, and compute the dimensions (512x512) of the compressed image (JPEG)?

Thank you in advance. :D

3
Did you actually do the Huffman encoding manually, or are you using some library? - Sam I am says Reinstate Monica
I used a library for the whole compression process. - Frank Smith

3 Answers

2
votes

If your image were encoded with no color subsampling, there would be a 1:1 ratio of 8x8 coefficient blocks to 8x8 color component blocks. Each MCU (minimum coded unit) would be 8x8 pixels and have three 8x8 coefficient blocks. 512x512 pixels = 64x64 8x8 blocks x 3 (one each for Y, Cr and Cb) = 12288 coefficient blocks.

Since you said you subsampled the color (I assume in both directions), you now have six 8x8 blocks for each MCU. In the diagram below, the leftmost diagram shows the case with no color subsampling and the rightmost diagram shows subsampling in both directions. The MCU size in this case is 16x16 pixels. Each 16x16 block of pixels needs six 8x8 coefficient blocks to define it (4 Y, 1 Cr, 1 Cb). If you divide the image into 16x16 MCUs, you get 32x32 MCUs, each with six 8x8 blocks: 32 x 32 x 6 = 6144 coefficient blocks.

So, to answer your question: the Huffman encoding is not what changes the number of coefficients, the color subsampling is. Part of the compression gained from color subsampling in JPEG images comes from exploiting a feature of the human visual system: our eyes are more sensitive to changes in luminance than to changes in chrominance.

[Diagram: MCU layouts for different color subsampling modes]
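The block counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of any encoder: `count_blocks` and its parameters `h_sub` and `v_sub` are hypothetical names, and it assumes a three-component image where both chroma planes share the same subsampling factors.

```python
def ceil_div(a, b):
    """Integer division rounded up, so partial MCUs at the edges count as whole."""
    return -(-a // b)

def count_blocks(width, height, h_sub=2, v_sub=2):
    """Return (number of MCUs, 8x8 blocks per MCU, total 8x8 coefficient blocks).

    h_sub / v_sub are the chroma subsampling factors (hypothetical parameter
    names): 1 means no subsampling, 2 means halved in that direction.
    """
    mcu_w, mcu_h = 8 * h_sub, 8 * v_sub              # MCU size in pixels
    mcus = ceil_div(width, mcu_w) * ceil_div(height, mcu_h)
    blocks_per_mcu = h_sub * v_sub + 2               # Y blocks + 1 Cr + 1 Cb
    return mcus, blocks_per_mcu, mcus * blocks_per_mcu

no_sub = count_blocks(512, 512, 1, 1)
both_sub = count_blocks(512, 512, 2, 2)
print(no_sub)    # (4096, 3, 12288)
print(both_sub)  # (1024, 6, 6144)
```

Multiplying the subsampled total by 64 coefficients per block gives the 393216 coefficients counted in the question.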

0
votes

Huffman encoding doesn't transform coefficients to pixels or anything like that. At least not the Huffman encoding that I'm thinking of. All Huffman encoding does is take a list of tokens and represent them with fewer bits, based on the frequency of those tokens.

An example: you have tokens a, b, c, and d.

Uncompressed, each of your tokens would require 2 bits (00, 01, 10, and 11).

Let's say a=00, b=01, c=10, and d=11.

aabaccda would be represented as 0000010010101100 (16 bits).

But with Huffman encoding you'd represent a with fewer bits because it's more common, and you'd represent b and d with more bits because they're less common. Something along the lines of:

a=0, b=110, c=10, d=111, and then

aabaccda would be represented as 00110010101110 (14 bits).
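To make the example concrete, here is a minimal Huffman sketch in Python using only the standard library. It is an illustration of the general technique, not the table construction a JPEG encoder actually uses, and `huffman_codes` is a hypothetical helper name. The exact bit patterns depend on how ties are broken, but the total length for an optimal code is the same.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a prefix code for the symbols in `text` (assumes >= 2 distinct symbols)."""
    freq = Counter(text)
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "aabaccda"
codes = huffman_codes(message)
encoded = "".join(codes[s] for s in message)
print(codes)
print(len(encoded), "bits instead of", 2 * len(message))  # 14 bits instead of 16
```

Note that the number of tokens is unchanged (eight in, eight out); only the number of bits used to store them shrinks.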

0
votes

Your image is 512x512 pixels. The Y component is 512x512, hence 262144 pixels turned into 262144 DCT coefficients. The Cb and Cr components are downsampled by 2 in each direction, hence 256x256 pixels turned into 65536 DCT coefficients each. The sum of all DCT coefficients is 262144 + 65536 + 65536 = 393216. Huffman has nothing to do with this.
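The same sum, written out as a small Python sketch. `plane_sizes` and `chroma_factor` are hypothetical names, and the code assumes the image dimensions divide evenly by the subsampling factor, as they do for 512x512.

```python
def plane_sizes(width, height, chroma_factor=2):
    """Dimensions of each component plane after chroma downsampling."""
    cw, ch = width // chroma_factor, height // chroma_factor
    return {"Y": (width, height), "Cb": (cw, ch), "Cr": (cw, ch)}

sizes = plane_sizes(512, 512)
# The DCT maps each 8x8 block of samples to 64 coefficients,
# so every plane has exactly as many coefficients as samples.
coeffs = {name: w * h for name, (w, h) in sizes.items()}
total = sum(coeffs.values())
print(coeffs)  # {'Y': 262144, 'Cb': 65536, 'Cr': 65536}
print(total)   # 393216
```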