
I want to gzip-compress a string in Python, encode the result as Base64, and then decode and decompress it in Java.

First I gzip-compress the string 'hello'+'\r\n'+'world' using the gzip module in Python and then Base64-encode the compressed bytes. The output I get is H4sIAM7yqVcC/8tIzcnJ5+Uqzy/KSQEAQmZWMAwAAAA=

Then I take that encoded string into Java. I first Base64-decode it with DatatypeConverter.parseBase64Binary, which gives a byte array, and then I decompress that byte array using GZIPInputStream. But the decompressed output in Java is helloworld.

The string I compressed in Python contained a '\r\n', but it does not appear in the decompressed output. I suspect the problem is in the Base64 encoding or decoding step. Please help me solve this problem.

String used:

string = 'hello'+'\r\n'+'world'

Expected output in java:

hello
world

Actual output:

helloworld

This is the gzip compression code in python:

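import base64
import gzip
import io

string = 'hello' + '\r\n' + 'world'

# Compress the string into an in-memory buffer.
out = io.BytesIO()
with gzip.GzipFile(fileobj=out, mode="w") as gz:
    gz.write(string.encode('utf-8'))

# Base64-encode the gzip bytes and write them to a file.
with open('compressed_string', 'wb') as f:
    f.write(base64.b64encode(out.getvalue()))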

This is the gzip decompression code in java:

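String compressedStr = "";
String nextLine;
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("compressed_string")));
try {
    // Read the Base64 text from the file written by the Python script.
    while ((nextLine = reader.readLine()) != null) {
        compressedStr += nextLine;
    }
} finally {
    reader.close();
}

// Base64-decode to the raw gzip bytes, then decompress.
byte[] compressed = DatatypeConverter.parseBase64Binary(compressedStr);
String decomp = decompress(compressed);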

This is the gzip decompression method in Java:

public static String decompress(final byte[] compressed) throws IOException {
    String outStr = "";
    if ((compressed == null) || (compressed.length == 0)) {
        return "";
    }
    if (isCompressed(compressed)) {
        GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            outStr += line;
        }
    } else {
        outStr = new String(compressed);
    }
    return outStr;
}
The InputStreamReader wrapping the "gis" stream contains the correct value. You consume the CRLF when you read the string line by line. - Konrad

1 Answer


From the Javadoc for BufferedReader.readLine():

Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

Returns:

A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
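The behaviour described in that excerpt can be reproduced without any gzip or Base64 involved. The following is a minimal sketch; the class name ReadLineDemo and the helper readLines are made up for illustration and are not part of the question's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: shows what readLine() returns for "hello\r\nworld".
public class ReadLineDemo {
    // Collects every line that readLine() produces for the given text.
    public static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // the "\r\n" terminator is not part of 'line'
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = readLines("hello\r\nworld");
        System.out.println(lines);                  // prints [hello, world]
        System.out.println(String.join("", lines)); // prints helloworld
    }
}
```

Joining the returned lines with nothing in between gives exactly the output seen in the question, which suggests the Base64 step is not where the '\r\n' is lost.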

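bufferedReader.readLine() reads one line at a time and strips the line terminator, so you need to add the '\r\n' back when you append each line:

outStr += line + "\r\n";

Note that this also appends a '\r\n' after the last line. Since you are concatenating in a loop, you should also use a StringBuilder instead of String +=.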
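An alternative that avoids re-adding terminators by hand is to skip readLine() and copy characters straight from the reader into a StringBuilder. The sketch below is one possible way to write it, not the asker's original method: the class name GzipText is hypothetical, the isCompressed check from the question is left out, and it assumes Java 8 or later so that java.util.Base64 can stand in for DatatypeConverter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class for illustration.
public class GzipText {
    // Decompresses gzip bytes into a String, keeping every character
    // (including '\r' and '\n') exactly as it was compressed.
    public static String decompress(byte[] compressed) throws IOException {
        if (compressed == null || compressed.length == 0) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            // read(char[]) copies raw characters; unlike readLine(),
            // it does not drop line terminators.
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Base64 string taken from the question.
        String encoded = "H4sIAM7yqVcC/8tIzcnJ5+Uqzy/KSQEAQmZWMAwAAAA=";
        byte[] compressed = Base64.getDecoder().decode(encoded);
        System.out.println(decompress(compressed));
    }
}
```

With this approach the result has no extra trailing '\r\n', and whatever line endings the Python side wrote ('\n', '\r\n', or a mix) come back unchanged.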