I'm trying to understand how php gzdeflate/gzinflate works. if I understand it's a simple bitstream with a three bit header. So, it would be possible to get only the first bits and see what is compressed. So by taking first bits, we could theoretically (?) extract bytes after bytes?
In fact I have lost some bits of a file (around the first 40-50 bytes with some missing bits, not all bits, just some of them). I just want to know if I can do an intelligent "bruteforce" in order to recreate the first bytes which can fully gzdeflate() the file. I know it's php code, so extracted bytes should be ASCII only. I tried to bruteforce all bits, but it's too long. So, if I could bruteforce bits after bits, it would be preferable.
(If there is a reimplementation in python which is readable, that would help me a lot). I've read http://www.zlib.net/feldspar.html which is more on how to compress data, I want to uncompress it.
Thank you
EDIT: Let's take an example. Here is my data (in hex):
39e0 6fb2 41eb ....
This data is substracted from a key. This key is used in its hexadecimal format, so I have only 16 possibilities for each char.
Algo is this one:
(ciphered - key) % 256 = deflate_data
Key is in hex form. So first byte key can only be: 0x30 -> 0x39 and 0x61 -> 0x66 I have only choices (in little endian for clarity, first bit is last block, two next bits are encoding type):
Key -> deflate
0 -> 10010000 --> No, if .00 code, all other bits must be 00
1 -> 00010000 --> No, if .00 code, all other bits must be 00
2 -> 11100000 --> No, .11 is reserved
3 -> 01100000 --> No, .11 is reserved
4 -> 10100000 --> Maybe?
5 -> 00100000 --> Maybe?
6 -> 11000000 --> Maybe?
7 -> 01000000 --> Maybe?
8 -> 10000000 --> Uncompressed? Must check LEN and NLEN
9 -> 00000000 --> Uncompressed? Must check LEN and NLEN
a -> 00011011 --> No, .00 should hav all others bits to 0
b -> 11101011 --> No, .11 is reserved
c -> 01101011 --> No, .11 is reserved
d -> 10101011 --> Maybe?
e -> 00101011 --> Maybe?
f -> 11001011 --> Maybe?
so the first byte of my key could be: 4,5,6,7 or d,e,f . Some of them used fixed dictionary. So is it theorically possible to try next bytes? Other bytes are dynamic dictionary. So is it possible to create a huffman tree with next bytes? Wrong key will likely produce impossible huffman trees. When I have few possibilities left, I could try to bruteforce remaining keys.
8 and 9 can be tested easily.
The key is constructed like this:
MD5(pass[::-1])+MD5(pass[:len(pass)])
So, theorically, key can be 32 to 50-60 chars, depending on the length of the key.