0
votes

I'm trying to reproduce in .NET an algorithm that was originally written in Java, and I'm having trouble with the GZIP decompression.

At the bottom of the post I inserted the hex string that is converted to a byte array in both .NET and Java. The resulting byte array is then decompressed in Java with the following method:

public static Object readObjectFromByte(byte[] bytes)
{
ObjectInputStream oos = null;
try {
  ByteArrayInputStream baos = new ByteArrayInputStream(bytes);
  GZIPInputStream zis = new GZIPInputStream(baos);
  oos = new ObjectInputStream(zis);
  return oos.readObject();
} catch (Throwable t) {
  return null;
} finally {
  try {
    if (oos != null) {
      oos.close();
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
}
}

After decompression the resulting byte array has a length of 3952, which is probably correct. I tried several different .NET classes and libraries to decompress the same data, but they always give a byte array of 3979 bytes, which is probably incorrect.

I read a lot of articles about GZIP issues in .NET trying to fix this. I use .NET 4.5, and for example my last decompression version is this:

Ionic.Zlib.GZipStream.UncompressBuffer(compressedBytes)

It's weird but even if I try:

Ionic.Zlib.GZipStream.CompressBuffer(Ionic.Zlib.GZipStream.UncompressBuffer(compressedBytes)).SequenceEqual(compressedBytes)

It gives me FALSE.

The hex string:

EDIT:

Java Code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class JavaFiddle
{
public static void main(String[] args)
{
  String hex = "PLEASE_UPDATE"; //update this from the hex constant at the end of the post
  byte[] compressedBytes = hexStringToByteArray(hex);
  byte[] decompressedBytes = (byte[])readObjectFromByte(compressedBytes);
  System.out.println(decompressedBytes.length); //THIS GIVES 3952
}

public static byte[] hexStringToByteArray(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                             + Character.digit(s.charAt(i+1), 16));
    }
    return data;
}

public static Object readObjectFromByte(byte[] bytes)
{
    ObjectInputStream oos = null;
    try {
      ByteArrayInputStream baos = new ByteArrayInputStream(bytes);
      GZIPInputStream zis = new GZIPInputStream(baos);
      oos = new ObjectInputStream(zis);
      return oos.readObject();
    } catch (Throwable t) {
      return null;
    } finally {
      try {
        if (oos != null) {
          oos.close();
        }
      } catch (IOException e) {
        e.printStackTrace();
      }
    }
}    
}

.NET Code

    private byte[] StringToByteArray(string hex)
    {
        int NumberChars = hex.Length;
        byte[] bytes = new byte[NumberChars / 2];
        for (int i = 0; i < NumberChars; i += 2)
            bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
        return bytes;
    }
    ...
    var hex = "PLEASE_UPDATE"; //update this from the hex constant at the end of the post
    var compressedBytes = StringToByteArray(hex);
    var decompressedBytes = Ionic.Zlib.GZipStream.UncompressBuffer(compressedBytes); 
    // decompressedBytes.Length is 3979. This uses an external library (Ionic.Zlib),
    // but the built-in .NET GZipStream gives the same result.



Thanks,

1
Rather than going for something that is "probably correct", it would help if you'd start off with known initial data, compress it, then decompress it in both Java and .NET. It doesn't help that you haven't shown any of the attempts in .NET. Ideally, provide a minimal reproducible example - are you able to demonstrate this with a very small piece of data, that could be easily hard-coded? - Jon Skeet
Thanks, unfortunately I don't have control over Java source code, I only saw how this produced a different result and I was given that only HEX string. I will try to play with it and will update the post later. As for .NET example, I mentioned about my last decompression try: Ionic.Zlib.GZipStream.UncompressBuffer(compressedBytes) I will also update my post to include the code how I convert HEX string to byte array. - Ihor Deyneka
It's not particularly odd that if you use one gzip library to decompress and recompress, you get different results, btw. There are various tweaks that could change the compression results. - Jon Skeet
Updated post, please see EDIT section - Ihor Deyneka
I have done several interop projects between .NET and Java systems, and there are often byte order mark differences... I'd start with that if I were in your situation. - Brian Driscoll

1 Answer

2
votes

Now we've got more of the Java code, we can see the problem: you've got an extra layer of serialization around your real data. That has nothing to do with compression really.

Here's an example to show what I mean:

import java.io.*;

public class Test {
    public static void main(String[] args) throws Exception {
        try (ByteArrayOutputStream output = new ByteArrayOutputStream()) {
            try (ObjectOutputStream oos = new ObjectOutputStream(output)) {
                oos.writeObject(new byte[5]);
            }
            byte[] data = output.toByteArray();
            System.out.println(data.length);
        }
    }
}

That's writing a byte array that's 5 bytes long - but the result is 32 bytes long, because of the extra "wrapper" information. Note that the extra 27 bytes (32 - 5) is the same as the discrepancy you've seen (3979 - 3952).
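To check that the wrapper size does not depend on the payload size, here is a small sketch that serializes zero-filled byte arrays of several lengths and prints the overhead. The class name SerializationOverhead and the helper serializedSize are made up for this illustration; the rest is the standard java.io API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class SerializationOverhead {
    // Hypothetical helper: serializes a zero-filled byte[] of the given length
    // with ObjectOutputStream and returns the size of the serialized form.
    static int serializedSize(int payloadLength) {
        try {
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(output)) {
                oos.writeObject(new byte[payloadLength]);
            }
            return output.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The difference between serialized size and payload size is the wrapper.
        for (int n : new int[] {0, 5, 100, 3952}) {
            int size = serializedSize(n);
            System.out.println(n + " -> " + size + " (overhead " + (size - n) + ")");
        }
    }
}
```

For a top-level byte[] the overhead printed should be 27 on every line, and the last line should show 3952 -> 3979, matching the two lengths in the question. Other object types have different wrapper sizes, so this only applies to a plain byte array.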

Fundamentally, it's odd to wrap a byte array in this way, and if you can possibly change the original code, that would be for the best. If you absolutely can't do that, it may be safe to just ignore the first 27 bytes of the resulting data.
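As a rough sketch of the "ignore the first 27 bytes" workaround, the following round trip builds data the way the producer presumably does (writeObject of a byte[] through a GZIP stream), decompresses it without any ObjectInputStream (which is effectively what the .NET side does), and then strips the header. All names here (StripSerializationHeader, HEADER_LENGTH, wrapAndCompress, decompress, stripHeader) are hypothetical, and the 27-byte figure is an assumption that only holds when the serialized object is a single top-level byte[]. It's in Java so it can be verified against the real serialization code; the same slicing should carry over to .NET.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StripSerializationHeader {
    // Assumption: wrapper size for one top-level serialized byte[].
    static final int HEADER_LENGTH = 27;

    // Mimics the presumed producer: serialize a byte[] into a GZIP stream.
    static byte[] wrapAndCompress(byte[] payload) {
        try {
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            try (ObjectOutputStream oos =
                     new ObjectOutputStream(new GZIPOutputStream(output))) {
                oos.writeObject(payload);
            }
            return output.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plain GZIP decompression only, with no Java deserialization step.
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Checks the serialization stream magic (AC ED), version (00 05) and the
    // array type code (75) before dropping the header, so unrelated data is
    // rejected instead of being silently truncated.
    static byte[] stripHeader(byte[] decompressed) {
        if (decompressed.length < HEADER_LENGTH
                || (decompressed[0] & 0xFF) != 0xAC
                || (decompressed[1] & 0xFF) != 0xED
                || decompressed[2] != 0x00
                || decompressed[3] != 0x05
                || decompressed[4] != 0x75) {
            throw new IllegalArgumentException("Not a serialized Java array");
        }
        return Arrays.copyOfRange(decompressed, HEADER_LENGTH, decompressed.length);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[3952];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }
        byte[] raw = decompress(wrapAndCompress(payload));
        System.out.println(raw.length);                                // 3979
        System.out.println(Arrays.equals(stripHeader(raw), payload));  // true
    }
}
```

The header check is there because blindly skipping 27 bytes would corrupt the data if the producer ever stopped wrapping it, or started serializing a different type.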