GZIP Java vs .NET

Question

Using the following Java code to compress/decompress bytes[] to/from GZIP. First text bytes to gzip bytes:

public static byte[] fromByteToGByte(byte[] bytes) {
    ByteArrayOutputStream baos = null;
    try {
        ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
        baos = new ByteArrayOutputStream();
        GZIPOutputStream gzos = new GZIPOutputStream(baos);
        byte[] buffer = new byte[1024];
        int len;
        while((len = bais.read(buffer)) >= 0) {
            gzos.write(buffer, 0, len);
        }
        gzos.close();
        baos.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return(baos.toByteArray());
}

Then the method that goes the other way compressed bytes to uncompressed bytes:

public static byte[] fromGByteToByte(byte[] gbytes) {
    ByteArrayOutputStream baos = null;
    ByteArrayInputStream bais = new ByteArrayInputStream(gbytes);
    try {
        baos = new ByteArrayOutputStream();
        GZIPInputStream gzis = new GZIPInputStream(bais);
        byte[] bytes = new byte[1024];
        int len;
        while((len = gzis.read(bytes)) > 0) {
            baos.write(bytes, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return(baos.toByteArray());
}

Think there is any effect since I'm not writing out to a gzip file?
Also I noticed that in the standard C# function that BitConverter reads the first four bytes and then the MemoryStream Write function is called with a start point of 4 and a length of input buffer length - 4. So is that effect the validity of the header?

Jim

Your best bet is to duplicate the two examples you have above in C#, and read and write to and from each platform respectively to see if you have any issues. — Romain Hippeau
Posted this because byte[] compressed using this way in java were giving me "Invalid GZip Header" error when I tried to uncompress it without using the Ionic package. With it decompression and compression worked fine and interchangably between Java and C#. Was trying to understand what the problem was. Jim — Jim Jones

Philip Daubmeier Philip Daubmeier · Accepted Answer · 2010-04-29T18:25:41

I tryed it out, and I cant reproduce your 'Invalid GZip Header' issue. Here is what I did:

Java side

I took your Java compression method together with this java snippet:

public static String ToHexString(byte[] bytes){
    StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < bytes.length; i++)
            hexString.append((i == 0 ? "" : "-") + 
                Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
    return hexString.toString();
}

So that this minimalistic java application, taking the bytes of a test string, compressing it, and converting it to a hex string of the compressed data...:

public static void main(String[] args){
    System.out.println(ToHexString(fromByteToGByte("asdf".getBytes())));
}

... outputs the following (I added annotations):

1f-8b-08-00-00-00-00-00-00-00-4b-2c-4e-49-03-00-bd-f3-29-51-04-00-00-00
^------- GZip Header -------^ ^----------- Compressed data -----------^

C# side

I wrote two methods for compressing and uncompressing a byte array to another byte array (compression method is just for completeness, and my testings):

public static byte[] Compress(byte[] uncompressed)
{
    using (MemoryStream ms = new MemoryStream())
    using (GZipStream gzs = new GZipStream(ms, CompressionMode.Compress))
    {
        gzs.Write(uncompressed, 0, uncompressed.Length);
        gzs.Close();
        return ms.ToArray();
    }
}

public static byte[] Decompress(byte[] compressed)
{
    byte[] buffer = new byte[4096];
    using (MemoryStream ms = new MemoryStream(compressed))
    using (GZipStream gzs = new GZipStream(ms, CompressionMode.Decompress))
    using (MemoryStream uncompressed = new MemoryStream())
    {
        for (int r = -1; r != 0; r = gzs.Read(buffer, 0, buffer.Length))
            if (r > 0) uncompressed.Write(buffer, 0, r);
        return uncompressed.ToArray();
    }
}

Together with a small function that takes a hex string and turns it back to a byte array... (also just for testing purposes):

public static byte[] ToByteArray(string hexString)
{
    hexString = hexString.Replace("-", "");
    int NumberChars = hexString.Length;
    byte[] bytes = new byte[NumberChars / 2];
    for (int i = 0; i < NumberChars; i += 2)
        bytes[i / 2] = Convert.ToByte(hexString.Substring(i, 2), 16);
    return bytes;
}

... I did the following:

// Just hardcoded the output of the java program, convert it back to byte[]
byte[] fromjava = ToByteArray("1f-8b-08-00-00-00-00-00-00-00-" + 
                  "4b-2c-4e-49-03-00-bd-f3-29-51-04-00-00-00");

// Decompress it with my function above
byte[] uncompr = Decompress(fromjava);

// Get the string out of the byte[] and print it
Console.WriteLine(System.Text.ASCIIEncoding.ASCII
                    .GetString(uncompr, 0, uncompr.Length));

Et voila, the output is:

asdf

Works perfect for me. Maybe you should check your decompression method in your c# application.

You said in your previous question you are storing those byte arrays in a database, right? Maybe you want to check whether the bytes come back from the database the way you put them in.

GZIP Java vs .NET

2 Answers

Java side

C# side