I am developing a file compressor program. We are currently implementing the .ZIP archive standard, so that when we generate a compressed .ZIP archive, any other reputable compressor (such as 7-Zip) can read and decompress it correctly.
We are now implementing the DEFLATE algorithm based on RFC 1951.
We have a variant of LZ77 and Huffman coding with fixed codes working correctly and compatible with the RFC, operating on literal/length + distance values.
With dynamic Huffman coding, I am currently able to extract the Huffman trees from some compressed data (produced by another reliable compressor), but when I start decompressing the actual data I get incorrect values.
Possibly I'm reading the trees incorrectly.
I have not found any source that explains precisely how the values of these trees are stored in the compressed data.
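
For reference, this is a minimal Python sketch of how I read the tree description in RFC 1951, section 3.2.7: three header fields (HLIT, HDIST, HCLEN), then the code lengths of the code-length alphabet in a fixed permuted order, then one run-length-encoded sequence of HLIT + HDIST code lengths covering both the literal/length tree and the distance tree. The names `build_codes` and `read_dynamic_header` are made up for illustration, and the sketch assumes `bitpos` already points just past the BFINAL/BTYPE bits. It is not a hardened decoder.

```python
# Order in which the code-length-code lengths are stored (RFC 1951, 3.2.7).
CL_ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]


def build_codes(lengths):
    """Canonical Huffman codes from code lengths (RFC 1951, 3.2.2).

    Returns a dict mapping (code_length, code_value) -> symbol.
    """
    max_len = max(lengths)
    bl_count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            bl_count[n] += 1
    next_code = [0] * (max_len + 2)
    code = 0
    for bits in range(1, max_len + 1):
        code = (code + bl_count[bits - 1]) << 1
        next_code[bits] = code
    table = {}
    for symbol, n in enumerate(lengths):
        if n:
            table[(n, next_code[n])] = symbol
            next_code[n] += 1
    return table


def read_dynamic_header(data, bitpos=0):
    """Return (litlen_lengths, dist_lengths, new_bitpos) for a dynamic block."""

    def bits(n):
        # Header fields and extra bits: first bit read is the LEAST significant.
        nonlocal bitpos
        value = 0
        for i in range(n):
            value |= ((data[bitpos >> 3] >> (bitpos & 7)) & 1) << i
            bitpos += 1
        return value

    def symbol(table):
        # Huffman code: first bit read is the MOST significant.
        code = length = 0
        while length < 15:
            code = (code << 1) | bits(1)
            length += 1
            if (length, code) in table:
                return table[(length, code)]
        raise ValueError("invalid Huffman code")

    hlit = bits(5) + 257   # number of literal/length codes
    hdist = bits(5) + 1    # number of distance codes
    hclen = bits(4) + 4    # number of code-length codes actually stored

    cl_lengths = [0] * 19
    for i in range(hclen):
        cl_lengths[CL_ORDER[i]] = bits(3)
    cl_table = build_codes(cl_lengths)

    # Both trees share ONE sequence of lengths; repeats may cross the boundary.
    lengths = []
    while len(lengths) < hlit + hdist:
        sym = symbol(cl_table)
        if sym < 16:
            lengths.append(sym)                       # literal code length
        elif sym == 16:
            if not lengths:
                raise ValueError("repeat with no previous length")
            lengths.extend([lengths[-1]] * (3 + bits(2)))
        elif sym == 17:
            lengths.extend([0] * (3 + bits(3)))       # 3..10 zeros
        else:
            lengths.extend([0] * (11 + bits(7)))      # 11..138 zeros
    if len(lengths) > hlit + hdist:
        raise ValueError("code lengths overrun")
    return lengths[:hlit], lengths[hlit:], bitpos
```

The two returned length lists would then each go through `build_codes` to get the literal/length and distance decoding tables for the block's data.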
I assume the encoded data uses the same literal/length values (0~285) and distance values (0~29), each followed by its corresponding extra bits as described in the RFC, the same way fixed Huffman encoding does.
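
To make that assumption concrete, here is a small Python sketch of how a decoded symbol plus its extra-bits value maps to a match length or distance, using the tables from RFC 1951, section 3.2.5. Note that the RFC defines distance codes 0 to 29 (codes 30 and 31 never occur in valid data). The function names `match_length` and `match_distance` are my own, and the extra-bits value is assumed to have already been read from the stream.

```python
# Base values and extra-bit counts for length symbols 257..285 (RFC 1951, 3.2.5).
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LENGTH_EXTRA = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
                3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0]

# Base values and extra-bit counts for distance symbols 0..29.
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
             257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
             8193, 12289, 16385, 24577]
DIST_EXTRA = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6,
              7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13]


def match_length(symbol, extra_value):
    """Match length (3..258) for a literal/length symbol in 257..285."""
    index = symbol - 257
    if not 0 <= index < len(LENGTH_BASE):
        raise ValueError("not a length symbol")
    if extra_value >> LENGTH_EXTRA[index]:
        raise ValueError("extra value does not fit in the extra bits")
    return LENGTH_BASE[index] + extra_value


def match_distance(symbol, extra_value):
    """Match distance (1..32768) for a distance symbol in 0..29."""
    if not 0 <= symbol < len(DIST_BASE):
        raise ValueError("not a valid distance symbol")
    if extra_value >> DIST_EXTRA[symbol]:
        raise ValueError("extra value does not fit in the extra bits")
    return DIST_BASE[symbol] + extra_value
```

As far as I can tell from the RFC, these tables apply to fixed and dynamic blocks alike. Only the Huffman codes assigned to the symbols differ.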
With fixed Huffman encoding, Huffman codes are packed starting from the least significant bit of each byte, with the most significant bit of the code first, so the first bit read from the stream is the top bit of the code. This way you can walk down the decoding tree bit by bit.
The extra bits that follow a Huffman code are stored the other way around, least significant bit first.
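
The bit order described above can be sketched as follows. RFC 1951, section 3.1.1 states the packing rule for all data elements without distinguishing between block types. The class name `BitReader` and its methods are hypothetical, chosen only for this illustration.

```python
class BitReader:
    """Reads bits from a byte string, least significant bit of each byte first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position in the stream

    def read_bit(self):
        byte = self.data[self.pos >> 3]
        bit = (byte >> (self.pos & 7)) & 1
        self.pos += 1
        return bit

    def read_bits(self, count):
        """Plain field or extra bits: the first bit read is the LEAST significant."""
        value = 0
        for i in range(count):
            value |= self.read_bit() << i
        return value

    def read_code(self, length):
        """Huffman code of known length: the first bit read is the MOST significant."""
        code = 0
        for _ in range(length):
            code = (code << 1) | self.read_bit()
        return code
```

For example, the byte `0b00000110` yields the bits 0, 1, 1 first. Read as a 3-bit extra-bits field that is 6, but read as a 3-bit Huffman code it is `011`, i.e. 3.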
Does Dynamic Huffman Coding store them the same way?
Is there something I am missing or that I should be aware of?