
I am trying to use the zlib ("zlib.h") deflate and inflate functions to compress and decompress streams in a PDF file. Input: a compressed stream from a PDF file. I implemented the inflate step and it works: I get the uncompressed stream. I then try to compress this stream again with deflate. The output is a compressed stream, but it is not equal to the input compressed stream, and the two also differ in length. What am I doing wrong? This is part of my code:

     size_t outsize = (streamend - streamstart) * 10;
            char* output = new char[outsize]; ZeroMemory(output, outsize);

            z_stream zstrm; ZeroMemory(&zstrm, sizeof(zstrm));
            zstrm.avail_in = streamend - streamstart + 1;
            zstrm.avail_out = outsize;
            zstrm.next_in = (Bytef*)(buffer + streamstart);//block of data to inflate
            zstrm.next_out = (Bytef*)output; 

            int rsti = inflateInit(&zstrm);
            if (rsti == Z_OK)
            {
                int rst2 = inflate(&zstrm, Z_FINISH);
                if (rst2 >= 0)
                {
                    cout.write(output, zstrm.total_out); cout << endl;//inflated data (may contain zero bytes, so write by length)
                }
            }

            // Size the output for the worst case: recompressed data can be larger than the original stream.
            uLong deflate_cap = compressBound(zstrm.total_out);
            char* deflate_output = new char[deflate_cap];
            ZeroMemory(deflate_output, deflate_cap);
            z_stream d_zstrm; ZeroMemory(&d_zstrm, sizeof(d_zstrm));

            d_zstrm.avail_in = (uInt) zstrm.total_out;//inflated length; strlen() would stop at the first zero byte
            d_zstrm.avail_out = (uInt) deflate_cap;
            d_zstrm.next_in = (Bytef*)(output);
            d_zstrm.next_out = (Bytef*)(deflate_output);
            int rsti1 = deflateInit(&d_zstrm, Z_DEFAULT_COMPRESSION);

            if (rsti1 == Z_OK)
            {
                int rst22 = deflate(&d_zstrm, Z_FINISH);
//I try to write deflated stream to file (binary data, so write by length)
                out.write(deflate_output, d_zstrm.total_out);
                out << endl << "**********************" << endl;
                printf("New size of stream: %lu\n", d_zstrm.total_out);
            }
But does it then decompress properly? If someone else compressed the data in the first place, it won't necessarily compress to the same thing, because there is more than one way to compress using the pattern matching method. - Weather Vane
@WeatherVane, yes, decompression is 100% correct; after decompression the streams contain the original structure of the PDF file. - Diana
Bear in mind too that there can be different levels of compression, trading speed for size. - Weather Vane
@WeatherVane Yes, sure. I tried different compression levels, but none of them gave the same result :( - Diana
Because it still follows the rules. It is lossless coding. Suppose you want to compress "aaaa123456789aaa". There are two possible pattern matches for the final "aaa" and either is correct. - Weather Vane
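
The "aaaa123456789aaa" example from the comments can be sketched in code. This is a toy model, not zlib's actual format: the `Token` struct and the `lit`, `match`, `decode` and `encoding` helpers are hypothetical names made up for illustration. It only shows that two different back-references for the final "aaa" decode to the same text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type for illustration only; real DEFLATE packs
// literals and back-references into Huffman-coded bits.
struct Token {
    bool is_match;        // false: literal byte, true: back-reference
    char literal;         // used when is_match is false
    std::size_t distance; // how far back to start copying (matches only)
    std::size_t length;   // how many bytes to copy (matches only)
};

Token lit(char c) { return {false, c, 0, 0}; }
Token match(std::size_t distance, std::size_t length) { return {true, 0, distance, length}; }

// Replays the tokens: literals are appended, matches copy earlier output.
std::string decode(const std::vector<Token>& tokens) {
    std::string out;
    for (const Token& t : tokens) {
        if (!t.is_match) { out.push_back(t.literal); continue; }
        assert(t.distance >= 1 && t.distance <= out.size());
        std::size_t start = out.size() - t.distance;
        for (std::size_t i = 0; i < t.length; ++i) out.push_back(out[start + i]);
    }
    return out;
}

// Encodes "aaaa123456789" as literals, then the final "aaa" as one match
// whose distance is chosen by the caller.
std::vector<Token> encoding(std::size_t final_distance) {
    std::vector<Token> tokens;
    for (char c : std::string("aaaa123456789")) tokens.push_back(lit(c));
    tokens.push_back(match(final_distance, 3));
    return tokens;
}
```

Here `encoding(13)` copies the final "aaa" from the first three letters and `encoding(12)` copies it from letters two to four. The token lists differ, yet `decode` returns the same string for both, which is the situation a real compressor is in when it picks among candidate matches.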

1 Answer


There is nothing wrong. There is not a unique compressed stream for a given uncompressed stream. All that is required is that the decompression give you back exactly what was compressed (hence "lossless").

The difference may simply be caused by different compression parameters, different compression code, or even a different version of the same compression code.

If you can't reproduce the original compressed data, so what? All that matters is that you can make a valid PDF file that can be decompressed and has the content that you want.