DEFLATE method reasoning

Question

Why does LZ77 DEFLATE use Huffman encoding for its second pass instead of LZW? Is there something about their combination that is optimal? If so, what is the nature of the output of LZ77 that makes it more suitable for Huffman compression than LZW or some other method entirely?

They could have gone for a range coder as the backend (but it's slower and it would be a bit annoying to put those extension bits inside the bitstream), or today probably ANS. — harold

Mark Adler · Accepted Answer · 2016-09-29T01:34:13

LZW tries to take advantage of repeated strings, just like the first "stage" as you call it of LZ77. It then does a poor job of entropy coding that information. LZW has been completely supplanted by more modern approaches. (Except for its legacy use in the GIF format.) Once LZ77 generates a list of literals and matches, there is nothing left for LZW to take advantage of, and it would then make an almost completely ineffective entropy coder for that information.
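To make the point concrete, here is a minimal Python sketch. It is not zlib's implementation, and every name in it (`lz77_tokens`, `detokenize`, `litlen_symbols`, `entropy_bits`) is hypothetical. It runs a naive greedy LZ77 pass, then measures the Shannon entropy of the resulting literal/length symbols. As a simplifying assumption, literals and match lengths are merged into one alphabet (`256 + length` for a match) and distances are ignored when measuring entropy.

```python
import math
from collections import Counter

MIN_LEN, MAX_LEN, WINDOW = 3, 258, 32768  # DEFLATE-like limits, used here as toy parameters


def lz77_tokens(data: bytes):
    """Naive greedy LZ77: returns ('lit', byte) and ('match', length, distance) tokens."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - WINDOW), i):
            k = 0
            while k < MAX_LEN and i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= MIN_LEN:
            tokens.append(("match", best_len, best_dist))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def detokenize(tokens) -> bytes:
    """Inverse of lz77_tokens, used only to check that the tokens are lossless."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, length, dist = tok
            for _ in range(length):
                out.append(out[-dist])  # byte by byte, so overlapping copies work
    return bytes(out)


def litlen_symbols(tokens):
    """Toy alphabet: 0..255 for literals, 256 + length for matches (distances ignored)."""
    return [t[1] if t[0] == "lit" else 256 + t[1] for t in tokens]


def entropy_bits(symbols) -> float:
    """Shannon entropy of the symbol stream, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


sample = b"the quick brown fox jumps over the lazy dog. " * 20 + b"the lazy dog naps."
tokens = lz77_tokens(sample)
symbols = litlen_symbols(tokens)
flat_bits = math.ceil(math.log2(256 + MAX_LEN + 1))  # fixed-width code for the toy alphabet

print(f"{len(sample)} bytes -> {len(tokens)} tokens")
print(f"flat code: {flat_bits} bits/symbol, entropy: {entropy_bits(symbols):.2f} bits/symbol")
```

After the LZ77 pass, the repeated strings have already been collapsed into a handful of match tokens, so a second dictionary coder has nothing left to find. What remains is a skewed symbol distribution, which is exactly what a variable-length entropy code such as Huffman exploits. The entropy figure here is only a rough lower bound for the toy alphabet: it ignores the cost of transmitting the code table and of coding distances, which real DEFLATE handles with a separate distance code and extra bits.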
