
I have a data buffer that contains multiple compressed members; each one may be a deflate or a zlib compressed member.

I found that the zlib inflate call returns Z_STREAM_END after processing the first compressed member. The buffer can hold any number of members (for example, 3), and the data comes from a remote side that does not communicate how many compressed members it contains.

So how can I use zlib's inflate functionality so that it works across multiple compressed members?

The following quick and dirty sample illustrates my problem. It was written against the zlib 1.2.5 library.

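/* example.c -- understanding zlib inflate/decompression operation
 */

#include <cstdint>   /* uint8_t, uint32_t */
#include <cstdlib>   /* exit */
#include <iostream>
#include <vector>
#include <zlib.h>

/* Print the message and the zlib return code, then exit, unless err is Z_OK. */
#define CHECK_ERR(err, msg) { \
    if (err != Z_OK) { \
        std::cerr << msg << " error: " << err << std::endl; \
        exit(1); \
    } \
}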

/* ===========================================================================
 * deflate() to create compressed data
 */
void test_deflate(std::vector<uint8_t> & input_data, std::vector<uint8_t>& compr)
{
    z_stream c_stream; /* compression stream */
    int err;

    compr.clear();

    c_stream.zalloc = (alloc_func)0;
    c_stream.zfree = (free_func)0;
    c_stream.opaque = (voidpf)0;

    err = deflateInit(&c_stream, Z_DEFAULT_COMPRESSION);
    CHECK_ERR(err, "deflateInit");

    c_stream.next_in  = &input_data[0];
    c_stream.avail_in = input_data.size();

    for (;;) {
        uint8_t c_buffer[10] = {};
        c_stream.next_out  = &c_buffer[0];
        c_stream.avail_out = 10;

        err = deflate(&c_stream, Z_FINISH);
        if (err == Z_STREAM_END)
        {
            for (int i = 0; i < (10 - c_stream.avail_out); i++)
                compr.push_back(c_buffer[i]);
            break;
        }
        CHECK_ERR(err, "deflate");
        for (int i = 0; i < (10 - c_stream.avail_out); i++)
            compr.push_back(c_buffer[i]);
    }

    std::cout << "Compressed data (size = " << std::dec << compr.size() << ") = ";
    for (int i = 0; i < compr.size(); i++)
        std::cout << (uint32_t) compr[i];
    std::cout << std::endl;

    err = deflateEnd(&c_stream);
    CHECK_ERR(err, "deflateEnd");
}

/* ===========================================================================
 * Test inflate()
 */
void test_inflate(std::vector<uint8_t> &compr,
                  std::vector<uint8_t> &uncompr)
{
    int err;
    z_stream d_stream; /* decompression stream */

    uncompr.clear();

    d_stream.zalloc = Z_NULL;
    d_stream.zfree = Z_NULL;
    d_stream.opaque = Z_NULL;
    d_stream.avail_in = 0;
    d_stream.next_in = Z_NULL;
    err = inflateInit(&d_stream);
    CHECK_ERR(err, "inflateInit");

    d_stream.avail_in = compr.size();
    d_stream.next_in  = &compr[0];

    for(;;) {
        uint8_t d_buffer[10] = {};
        d_stream.next_out = &d_buffer[0];
        d_stream.avail_out = 10;

        err = inflate(&d_stream, Z_NO_FLUSH);

        if (err == Z_STREAM_END) {
            for (int i = 0; i < (10 - d_stream.avail_out); i++)
                uncompr.push_back(d_buffer[i]);
            if (d_stream.avail_in == 0)
                break;
        }

        CHECK_ERR(err, "inflate");
        for (int i = 0; i < (10 - d_stream.avail_out); i++)
            uncompr.push_back(d_buffer[i]);
    }
    err = inflateEnd(&d_stream);
    CHECK_ERR(err, "inflateEnd");

    std::cout << "Uncompressed data (size = " << std::dec << uncompr.size() << ") = ";
    for (int i = 0; i < uncompr.size(); i++)
        std::cout << (uint32_t) uncompr[i];
    std::cout << std::endl;
}


/* ===========================================================================
 * Usage:  example
 */

int main(int argc, char **argv)
{
    std::vector<uint8_t> input_data;
    std::vector<uint8_t> compr, multiple_compr;
    std::vector<uint8_t> uncompr;

    std::cout << "Input Data (in hex) = ";
    for (int i=0; i<32; i++) {
        input_data.push_back((uint8_t)i);
        if( i && (i % 2 == 0))
            std::cout << " ";
        std::cout << std::hex << (uint32_t)input_data[i];
    }
    std::cout << std::endl;

    // create compressed buffer-1 from input data
    test_deflate(input_data, compr);

    // copy compressed buffer-1 data into multiple compressed member buffer
    multiple_compr = compr;
    compr.clear();

    // create compressed buffer-2 from input data
    test_deflate(input_data, compr);

    // append data of compressed buffer-2 into multiple compressed member buffer
    for(int i=0; i< compr.size(); i++)
    {
        multiple_compr.push_back(compr[i]);
    }

    // create decompressed output
    test_inflate(multiple_compr, uncompr);

    // compare decompressed data with input data
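    // expected output: the input data twice, once per compressed member
    std::vector<uint8_t> final_data;
    final_data.insert(final_data.end(), input_data.begin(), input_data.end());
    final_data.insert(final_data.end(), input_data.begin(), input_data.end());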
    if (final_data == uncompr)
       std::cout << "Matched" << std::endl;
    else
       std::cout << "Not Matched" << std::endl;

    return 0;
}

1) Here the second inflate call returns an error, but I want it to proceed successfully. Why does it behave like this?

2) When I pass Z_FINISH as the flush argument to inflate, it returns an error. Why can't I use Z_FINISH here?

Kindly correct my example and suggest an optimized approach for doing the same.

A bit unclear. You get chunks of data and decompress them, but you don't know how many chunks you may get? That seems unrelated to the entire zlib story you wrapped around it. - Jongware
Yes, it might be, as I found with the zlib inflate call. What I want to know is whether there is a mechanism for this: I hand the data buffer to the stream through avail_in, and if it holds multiple compressed members (say 3), then after the first member is decompressed avail_in should still cover the data of the other 2 members. Can't I proceed with that? I am still learning zlib, and the idea is not yet clear to me even after reading the zlib manual, so it would help if you could share a proper usage example of zlib inflate. - ronex dicapriyo
Ah, wait. So after decompressing, there might be data remaining in your input buffer. See zlib.net/zlib_how.html, which mentions this about halfway through. (I have never used this myself, so I'm going to bail out here.) - Jongware

1 Answer


Simply repeat the inflate operation on the remaining data.

You can avoid some unnecessary free() and malloc() calls by using inflateReset() instead of inflateEnd() followed by inflateInit(). After an inflate() call returns Z_STREAM_END, next_in and avail_in may still describe leftover input from that call, so consume that first and only then reload the input.