I'm trying to figure out how to prove that the Lempel ZIV 77 algorithm for Compression is really gives optimal compression.
I found the following info:
So how well does the Lempel-Ziv algorithm work? In these notes, we’ll
calculate two quantities. First, how well it works in the worst case, and
second, how well it works in the random case where each letter of the message
is chosen uniformly and independently from a probability distribution, where
the ith letter appears with probability pi
. In both cases, the compression
is asymptotically optimal. That is, in the worst case, the length of the
encoded string of bits is n + o(n). Since there is no way to compress all
length-n strings to fewer than n bits, this can be counted as asymptotically
optimal. In the second case, the source is compressed to length
α
H(p1, p2, . . . , pα)n + o(n) = n∑(-pi log2 pi) + O(n)
i=1
which is to first order the Shannon bound.
What is meant here? And why there is no way to compress alllength-n strings to fewer than n bits?
Thanks all.