I'm programming in C on Linux. Following the usage example on zlib's official website (http://www.zlib.net/zlib_how.html), I wrote a compression program. Note that I compress to the gzip format, which means calling deflateInit2() instead of deflateInit().
According to zlib's website, "CHUNK is simply the buffer size for feeding data to and pulling data from the zlib routines. Larger buffer sizes would be more efficient, especially for inflate(). If the memory is available, buffers sizes on the order of 128K or 256K bytes should be used." So I assumed that the bigger the CHUNK, the smaller the compressed file and the faster the compression.
But when I tested my program, I found that the compressed file size is the same whether CHUNK is 16384 (the value used in zlib's official example) or 1. The only difference is that compression is much slower when CHUNK is 1.
This result confuses me. I expected compression to be ineffective when CHUNK is 1, because in this routine each input chunk is passed to deflate() and whatever output it produces is written straight to the compressed file, and I don't see how 1 byte of data can be compressed.
So my question is, why does the CHUNK size only affect the compression speed, but not the compression ratio?
Here's my program:
    #include <stdio.h>
    #include <assert.h>
    #include "zlib.h"

    #define CHUNK 16384

    int def(FILE *source, FILE *dest, int level, int memLevel)
    {
        int ret, flush;
        unsigned have;
        z_stream strm;
        unsigned char in[CHUNK];
        unsigned char out[CHUNK];

        /* allocate deflate state */
        strm.zalloc = Z_NULL;
        strm.zfree = Z_NULL;
        strm.opaque = Z_NULL;
        /* MAX_WBITS + 16 selects the gzip wrapper instead of the zlib one */
        ret = deflateInit2(&strm, level, Z_DEFLATED, MAX_WBITS + 16, memLevel, Z_DEFAULT_STRATEGY);
        if (ret != Z_OK)
            return ret;

        /* compress until end of file */
        do {
            strm.avail_in = fread(in, 1, CHUNK, source);
            if (ferror(source)) {
                (void)deflateEnd(&strm);
                return Z_ERRNO;
            }
            flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
            strm.next_in = in;

            /* run deflate() on input until output buffer not full, finish
               compression if all of source has been read in */
            do {
                strm.avail_out = CHUNK;
                strm.next_out = out;
                ret = deflate(&strm, flush);    /* no bad return value */
                assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
                have = CHUNK - strm.avail_out;
                if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                    (void)deflateEnd(&strm);
                    return Z_ERRNO;
                }
            } while (strm.avail_out == 0);
            assert(strm.avail_in == 0);     /* all input will be used */

            /* done when last data in file processed */
        } while (flush != Z_FINISH);
        assert(ret == Z_STREAM_END);        /* stream will be complete */

        /* clean up and return */
        (void)deflateEnd(&strm);
        return Z_OK;
    }