6
votes

Is there a way to know whether a byte[] has been compressed by the .NET GZipStream class?

EDIT: I just want to know whether the byte[] array has been compressed (I will always be using GZipStream to compress and decompress).

3
Where do you get the byte[] from? - Mark Byers
@Mark: From the WCF runtime. It's actually an ArraySegment<byte>. - stackoverflowuser

3 Answers

8
votes

A GZipStream is a DeflateStream with an additional header and trailer.

The format is specified in RFC 1952.


The .NET 4.0 GZipStream class writes the following bytes as header:

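// Layout per RFC 1952: ID1, ID2 (magic), CM = 8 (deflate), FLG, MTIME (4 bytes), XFL, OS
byte[] headerBytes = new byte[] { 0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 4, 0 };
if (compressionLevel == 10)
{
    headerBytes[8] = 2; // XFL = 2: maximum compression (4 = fastest)
}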

The trailer is 8 bytes long: a CRC-32 checksum of the uncompressed data followed by the length of the uncompressed data (modulo 2^32), both stored little-endian.
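To see this layout on your own runtime, here is a minimal sketch that compresses a small buffer with GZipStream, prints the first 10 bytes, and reads the length field back from the trailer. The class and method names (GZipLayoutDemo, Compress, ReadUncompressedLength) are made up for illustration. It assumes a single-member gzip stream, and the header bytes after the first three may differ from the ones listed above on other framework versions.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipLayoutDemo
{
    // Compresses the input with GZipStream and returns the raw gzip bytes.
    public static byte[] Compress(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream first so that the trailer is written out.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // Reads the last four bytes of the trailer (uncompressed length, little-endian).
    public static uint ReadUncompressedLength(byte[] gzipData)
    {
        // 10-byte header + 8-byte trailer is the bare minimum.
        if (gzipData == null || gzipData.Length < 18)
        {
            throw new ArgumentException("Too short to be a gzip stream.", "gzipData");
        }

        int n = gzipData.Length;
        return (uint)gzipData[n - 4]
             | ((uint)gzipData[n - 3] << 8)
             | ((uint)gzipData[n - 2] << 16)
             | ((uint)gzipData[n - 1] << 24);
    }

    public static void Main()
    {
        byte[] input = Encoding.UTF8.GetBytes("hello, gzip");
        byte[] compressed = Compress(input);

        // Dump the header so it can be compared with the bytes listed above.
        Console.WriteLine(BitConverter.ToString(compressed, 0, 10));

        // The trailer length should match the original input length.
        Console.WriteLine(ReadUncompressedLength(compressed) == (uint)input.Length);
    }
}
```

Only the first three bytes (0x1f, 0x8b, 8) are fixed by the format for deflate-compressed gzip data, so they are the safest part of the header to rely on.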

3
votes

Thanks to @dtd's explanation, this works for me (@stackoverflowuser, you may want this):

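using System.Linq;

public static class CompressionHelper
{
    // The two 10-byte headers written by the .NET 4.0 GZipStream.
    public static readonly byte[] GZipHeaderBytes = {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 4, 0};
    public static readonly byte[] GZipLevel10HeaderBytes = {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 2, 0};

    public static bool IsPossiblyGZippedBytes(this byte[] a)
    {
        // A gzip stream is always longer than its fixed 10-byte header.
        if (a == null || a.Length <= 10)
        {
            return false;
        }

        // Take the first 10 bytes with LINQ (SubArray is not a framework method).
        var header = a.Take(10).ToArray();

        return header.SequenceEqual(GZipHeaderBytes) || header.SequenceEqual(GZipLevel10HeaderBytes);
    }
}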
1
votes

You could look at the first few bytes for the magic header to see whether the data is gzipped, but unless the .NET compressor writes additional information into the comment or one of the other optional fields, you probably can't tell which compressor produced it.

http://www.onicos.com/staff/iz/formats/gzip.html

You could also look at the OS type field to see whether it is FAT or NTFS, but that still doesn't tell you it was written by C#...
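As a rough sketch of the magic-header check, working directly on the ArraySegment<byte> mentioned in the question's comments: the class and method names (GZipSniffer, LooksLikeGZip, OsField) are hypothetical. The check only reports that the data starts like a gzip stream. It cannot say which compressor wrote it, and uncompressed data could begin with the same bytes by coincidence.

```csharp
using System;

public static class GZipSniffer
{
    // True if the segment starts with the gzip magic bytes (0x1f, 0x8b)
    // followed by compression method 8 (deflate).
    public static bool LooksLikeGZip(ArraySegment<byte> segment)
    {
        if (segment.Array == null || segment.Count < 3)
        {
            return false;
        }

        byte[] buffer = segment.Array;
        int start = segment.Offset;

        return buffer[start] == 0x1f
            && buffer[start + 1] == 0x8b
            && buffer[start + 2] == 8;
    }

    // Returns the OS type byte (offset 9 in the header), or -1 if the
    // segment is too short. Per RFC 1952, 0 means FAT and 11 means NTFS.
    public static int OsField(ArraySegment<byte> segment)
    {
        if (segment.Array == null || segment.Count < 10)
        {
            return -1;
        }

        return segment.Array[segment.Offset + 9];
    }
}
```

Reading through segment.Offset matters here, because the segment may start in the middle of a larger buffer.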