
I'm trying to decompress a buffer that was compressed by PHP's deflate implementation. Here's the code:

    public static void CopyTo(Stream src, Stream dest)
    {
        byte[] bytes = new byte[4096];

        int cnt; // number of bytes returned by the most recent Read call

        while ((cnt = src.Read(bytes, 0, bytes.Length)) != 0 )
        {
            dest.Write(bytes, 0, cnt);
        }
        dest.Flush();
    }

    public static byte[] Unzip(byte[] bytes)
    {
        using (var msi = new MemoryStream(bytes))
        using (var mso = new MemoryStream())
        {
            using (var gs = new DeflateStream(msi, CompressionMode.Decompress))
            {

                // Skip the first two bytes of the input so that
                // DeflateStream does not reject the data.
                msi.ReadByte();
                msi.ReadByte();
                CopyTo(gs, mso);
            }

            return mso.ToArray();
        }
    }

As you can see, I skip the first 2 bytes of the source stream; otherwise DeflateStream throws an exception saying "invalid block size". My problem is that this code works like a charm for some files, but for others it gives a corrupted result: the output contains only part of the file, as if the whole file had not been decompressed. Does anyone have any idea what's wrong?

Thanks

UPDATE

I found out which PHP function was used to compress the data. It's gzcompress.

It sounds like it's quite possibly a problem on the PHP side, to be honest. - Jon Skeet

1 Answer


You didn't say which PHP function you used, but I'm guessing gzcompress(). That produces the zlib format, which is raw deflate data with a zlib header and trailer wrapped around it, whereas DeflateStream expects raw deflate with no header or trailer. That's why you have to skip the first two bytes, which are the zlib header.
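As a rough sketch of what that means in code, assuming the buffer really is a complete zlib stream as described in RFC 1950 (a 2-byte header, the deflate data, then a 4-byte Adler-32 trailer) and was written without a preset dictionary: the helper below checks the header and hands only the deflate portion to DeflateStream. The names `ZlibHelper` and `InflateZlib` are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZlibHelper
{
    // Hypothetical helper: unwraps a zlib stream and inflates the raw deflate data inside.
    public static byte[] InflateZlib(byte[] zlibData)
    {
        // 2 header bytes + 4 trailer bytes is the smallest possible wrapper.
        if (zlibData == null || zlibData.Length < 6)
            throw new ArgumentException("Too short to be a zlib stream.", nameof(zlibData));

        int cmf = zlibData[0];
        int flg = zlibData[1];

        // RFC 1950: compression method must be 8 (deflate) and
        // the 16-bit header value must be a multiple of 31.
        if ((cmf & 0x0F) != 8 || ((cmf << 8) | flg) % 31 != 0)
            throw new InvalidDataException("Data does not start with a zlib header.");

        // Assumption: no preset dictionary (FDICT bit clear), so the header is exactly 2 bytes.
        if ((flg & 0x20) != 0)
            throw new NotSupportedException("Preset dictionaries are not handled in this sketch.");

        // Expose only the bytes between the header and the Adler-32 trailer.
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 6))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Failing loudly on an unexpected header is a deliberate choice here: it turns "wrong format" into an immediate error instead of a partially decompressed file.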

The PHP function names are terrible and confusing, and the documentation doesn't help much. There are three formats in play here: raw deflate, gzip-wrapped deflate, and zlib-wrapped deflate. All of the PHP functions start with gz but only some of them actually process the gzip format.
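For reference, gzdeflate() writes raw deflate, gzcompress() writes zlib, and gzencode() writes gzip. If you are not sure which one produced a given buffer, the first two bytes usually give it away. The following is only a heuristic sketch with made-up names (`DeflateContainer`, `FormatSniffer`): a raw deflate stream has no signature at all, so it can only be reported as "not gzip and not zlib", and in rare cases raw data may happen to look like a zlib header.

```csharp
using System;

public enum DeflateContainer { Gzip, Zlib, RawOrUnknown }

public static class FormatSniffer
{
    // Hypothetical helper: guesses the wrapper format from the first two bytes.
    public static DeflateContainer Detect(byte[] data)
    {
        if (data == null || data.Length < 2)
            return DeflateContainer.RawOrUnknown;

        // gzip streams always begin with the magic bytes 0x1F 0x8B.
        if (data[0] == 0x1F && data[1] == 0x8B)
            return DeflateContainer.Gzip;

        // zlib header: low nibble of the first byte is 8 (deflate),
        // and the two bytes read as a big-endian number are divisible by 31.
        bool methodIsDeflate = (data[0] & 0x0F) == 8;
        bool checkBitsOk = ((data[0] << 8) | data[1]) % 31 == 0;
        if (methodIsDeflate && checkBitsOk)
            return DeflateContainer.Zlib;

        // Raw deflate carries no signature, so this is the fallback.
        return DeflateContainer.RawOrUnknown;
    }
}
```

On the .NET side, gzip data goes to GZipStream and raw deflate goes to DeflateStream; zlib data needs its wrapper removed before DeflateStream will accept it.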

The fact that it sometimes works and sometimes doesn't could be due to end-of-line or other text conversions. Make sure that you are reading the actual bytes of the file without corruption.
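One way to check this, sketched under the assumption that the data is in zlib format: the last four bytes of a zlib stream hold the Adler-32 checksum of the uncompressed data, stored big-endian. Computing the checksum over your decompressed output and comparing it with that trailer tells you whether the result is complete. The class and method names below are invented for the example.

```csharp
using System;

public static class Adler32Check
{
    // Adler-32 as defined in RFC 1950.
    public static uint Compute(byte[] data)
    {
        const uint Mod = 65521; // largest prime below 2^16
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Mod;
            b = (b + a) % Mod;
        }
        return (b << 16) | a;
    }

    // Reads the big-endian checksum stored in the last four bytes of a zlib stream.
    public static uint ReadTrailer(byte[] zlibData)
    {
        if (zlibData == null || zlibData.Length < 4)
            throw new ArgumentException("Too short to contain a trailer.", nameof(zlibData));

        int n = zlibData.Length;
        return ((uint)zlibData[n - 4] << 24)
             | ((uint)zlibData[n - 3] << 16)
             | ((uint)zlibData[n - 2] << 8)
             | (uint)zlibData[n - 1];
    }
}
```

If `Compute(output)` differs from `ReadTrailer(input)`, either the input was altered on its way to you (for example by being read or transferred as text rather than as bytes) or decompression stopped early. Reading the file with File.ReadAllBytes rather than any text-based API avoids the first cause.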