0
votes

I'm solving an exercise in a book about compression, it says if we have the alphabet[A,B,C] and probablities P(A)=0.5,P(B)=0.4,P(C)=0.1 how many bits needed for encoding BBB with huffman and arithmetic.

Now the book itself says in a section "BBB requires 6 bits with huffman but only 4 bits with arithmetic", I have verified that arithmetic needs 4 bits but don't know why huffman needs 6.

I solved huffman like this :
B(3)
P1(3)
EOF(0) B(3)

so B is encoded with one bit only !!

I thought maybe I should include the whole alphabet like this :
A(0),C(0),B(3)
P1(0) B(3)
A(0) C(0)
P2(3)
P1(0) B(3)
A(0) C(0)

still B needs 1 bit(but A and C 2 bits, still no 6 !!).

What am I doing wrong here ?

2

2 Answers

0
votes

I figured it out, because P(B)=0.4 so and we have 3 letters we have 3*0.4=1.2 which is approximately 1 and P(A)=0.5 and 3*0.5=1.5 which is between 1 and 2, and P(C)=0.1 and 0.1*3=0.3 which is approximately 0.

So if we sort them we get :
A(1 or 2),B(1),C(0)
and the tree is :
A(1 or 2) P1(1)
B(1) C(0)

P2(2 or 3)
A(1 or 2) P1(1)
B(1) C(0)

In the end one B requires 2 bits so BBB requires 6 bits to encode, problem solved :) .

0
votes

A has a higher probability than B, so B cannot be assigned fewer bits than A. The Huffman code gives two bits to B, hence six bits for BBB.