I need to decompress some zlib-compressed files found within a game's save data. I have no access to the game's source. Each file begins with 0x789C, which tells me that they are indeed compressed with zlib. However, every call to inflate on these files fails partway through and returns Z_DATA_ERROR. I get identical results with zlib versions 1.2.5, 1.2.8, and 1.2.11.
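For reference, this is how I read those two leading bytes. It is a minimal sketch based on the header layout in RFC 1950; `ZlibHeader` and `parseZlibHeader` are hypothetical names of my own, not part of zlib.

```cpp
#include <cstdint>

// Fields of the 2-byte zlib stream header described in RFC 1950.
struct ZlibHeader {
    int  method;      // CM: low 4 bits of the first byte; 8 means deflate
    int  windowBits;  // CINFO + 8: log2 of the LZ77 window size
    bool hasDict;     // FDICT: a preset dictionary ID follows the header
    int  level;       // FLEVEL: compression level hint, 0..3
    bool checkOk;     // true when (CMF * 256 + FLG) is a multiple of 31
};

// Hypothetical helper (not a zlib function): decode the two header bytes.
ZlibHeader parseZlibHeader(std::uint8_t cmf, std::uint8_t flg)
{
    ZlibHeader h;
    h.method     = cmf & 0x0F;
    h.windowBits = (cmf >> 4) + 8;
    h.hasDict    = (flg & 0x20) != 0;
    h.level      = (flg >> 6) & 0x03;
    h.checkOk    = ((cmf << 8) | flg) % 31 == 0;
    return h;
}
```

For 0x78 0x9C this gives method 8 (deflate), a 15-bit window, no preset dictionary, level hint 2, and a passing header check, which is why I believe the streams are meant to be ordinary zlib.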
Even though zlib is telling me the input data is corrupt, I'm confident that it is not: the game decompresses these files with no issues, and the problem is not isolated to a single data stream. I have hundreds of thousands of unique data streams compressed the same way, and they all fail with Z_DATA_ERROR somewhere in the middle of the decompression.
Furthermore, the portion of the data that IS successfully decompressed is correct. The output is exactly as expected.
Also, about 10% of the time, zlib WILL decompress the entire file, but the result is not correct: large chunks of the decompressed data contain the same byte repeated over and over, which tells me the success was a false positive.
Here's my decompression code:
// QByteArray is a Qt wrapper for a char *
QByteArray Compression::DecompressData(QByteArray data)
{
    QByteArray result;
    int ret;
    z_stream strm;
    static const int CHUNK_SIZE = 1; // set to 1 just for debugging
    char out[CHUNK_SIZE];

    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = data.size();
    strm.next_in = (Bytef*)(data.data());

    // Negative windowBits selects raw deflate: no zlib header or trailer is expected
    ret = inflateInit2(&strm, -15);
    if (ret != Z_OK)
    {
        qDebug() << "init error" << ret;
        return QByteArray();
    }

    do
    {
        strm.avail_out = CHUNK_SIZE;
        strm.next_out = (Bytef*)(out);
        ret = inflate(&strm, Z_NO_FLUSH);
        // total_in tells me which input byte caused the failure
        qDebug() << "debugging output: " << ret << QString::number(strm.total_in, 16);
        Q_ASSERT(ret != Z_STREAM_ERROR);
        switch (ret)
        {
        case Z_NEED_DICT:
            ret = Z_DATA_ERROR;
            // fall through
        case Z_DATA_ERROR:
        case Z_MEM_ERROR:
            (void)inflateEnd(&strm);
            return result; // return whatever was decompressed before the error
        }
        result.append(out, CHUNK_SIZE - strm.avail_out);
    } while (strm.avail_out == 0);

    inflateEnd(&strm);
    return result;
}
Here is a pastebin of an example file's compressed data, with the leading 0x789C and the trailing checksum removed. I can supply practically endless example files. All of them have the same issue.
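A note on the trailer I stripped: in a standard zlib stream (RFC 1950) the last four bytes are an Adler-32 checksum of the uncompressed data, stored big-endian, rather than a CRC-32 (which is what gzip uses). Whether the game's files follow the standard here is an assumption on my part. The sketch below shows how the trailer could be compared against the decompressed output; `adler32Of` and `readBigEndian32` are hypothetical helper names, and zlib's own `adler32()` could be used instead.

```cpp
#include <cstddef>
#include <cstdint>

// Plain Adler-32 as defined in RFC 1950 (unoptimized, for illustration only).
std::uint32_t adler32Of(const unsigned char *data, std::size_t len)
{
    const std::uint32_t MOD = 65521; // largest prime below 2^16
    std::uint32_t a = 1;
    std::uint32_t b = 0;
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % MOD;
        b = (b + a) % MOD;
    }
    return (b << 16) | a;
}

// Read a 4-byte big-endian value, e.g. the last four bytes of a zlib stream.
std::uint32_t readBigEndian32(const unsigned char *p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}
```

If a file ever decompresses completely, comparing `adler32Of(output, size)` with `readBigEndian32(lastFourBytes)` would give an independent check on whether the output is right.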
Running that data through the above function decompresses the beginning of the stream correctly, but fails at input byte 0x18C. You can tell the data decompressed correctly when the output begins with 0x000B and is longer than the input.
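To make those checks concrete, here is a rough sketch of how I judge a result. Both function names are hypothetical, and the run-length check is only a heuristic for the "same byte repeated over and over" symptom described earlier; what counts as a suspiciously long run is left to the caller.

```cpp
#include <cstddef>
#include <vector>

// True when the output starts with 0x00 0x0B and is longer than the input.
bool startsAsExpected(const std::vector<unsigned char> &output,
                      std::size_t inputSize)
{
    return output.size() > inputSize &&
           output.size() >= 2 &&
           output[0] == 0x00 &&
           output[1] == 0x0B;
}

// Length of the longest run of identical consecutive bytes (0 if empty).
std::size_t longestRun(const std::vector<unsigned char> &bytes)
{
    std::size_t best = 0;
    std::size_t current = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        current = (i > 0 && bytes[i] == bytes[i - 1]) ? current + 1 : 1;
        if (current > best) {
            best = current;
        }
    }
    return best;
}
```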
I wish I knew more about deflate compression to solve this problem myself. My initial thought is that the game uses a custom version of zlib, or that an extra parameter needs to be given to zlib in order to decompress the data correctly. I've asked around and tried many things for days, and I really need someone with knowledge on the subject to weigh in here. Thanks for your time!