1
votes

I am trying to use zlib to deflate (compress) data from a text file. Compressing a file on its own works, but I want to prepend a custom header to the data so that the header and the file are compressed together. However, when I add the header, the compressed (deflated) output is much shorter than expected and is not a valid zlib object.

The code works until I add the header block between the XXX comments below.

The "FILE *source" argument is a sample file (I typically use /etc/passwd) and "char *header" is "blob 2172\0". Without the header block, the output is 904 bytes and can be inflated (decompressed); with the header block, the output is only 30 bytes and is not a valid zlib object.

Any ideas where I am making a mistake, specifically why the output is invalid and shorter with the header?

If it's relevant, I am writing this on FreeBSD.

#define Z_CHUNK16384
#define HEX_DIGEST_LENGTH       257

int
zcompress_and_header(FILE *source, char *header)
{
int ret, flush;
z_stream strm;
unsigned int have;
unsigned char in[Z_CHUNK];
unsigned char out[Z_CHUNK];

FILE *dest = stdout; // This is a temporary test

strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
ret = deflateInit(&strm, Z_BEST_SPEED);
//ret = deflateInit2(&strm, Z_BEST_SPEED, Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY);

if (ret != Z_OK)
     return ret;

/* XXX Beginning of writing the header */

strm.next_in = (unsigned char *) header;
strm.avail_in = strlen(header) + 1;

do {
     strm.avail_out = Z_CHUNK;
     strm.next_out = out;
     if (deflate (& strm, Z_FINISH) < 0) {
          fprintf(stderr, "returned a bad status of.\n");
          exit(0);
     }
     have = Z_CHUNK - strm.avail_out;
     fwrite(out, 1, have, stdout);
} while(strm.avail_out == 0);

/* XXX End of writing the header */

do {
     strm.avail_in = fread(in, 1, Z_CHUNK, source);
     if (ferror(source)) {
          (void)deflateEnd(&strm);
          return Z_ERRNO;
     }

     flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
     strm.next_in = in;

     do {
          strm.avail_out = Z_CHUNK;
          strm.next_out = out;
          ret = deflate(&strm, flush);
          have = Z_CHUNK - strm.avail_out;
          if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
               (void)deflateEnd(&strm);
               return Z_ERRNO;
          }
     } while(strm.avail_out == 0);

} while (flush != Z_FINISH);

} // End of function
2
Some questions, for clarification: 1) You defined "#define Z_CHUNK16384"; I suppose it should be "#define Z_CHUNK 16384". If Z_CHUNK is defined elsewhere, the program compiles but uses an unexpected chunk size. - Giuseppe Guerrini
2) You are using strlen on header. Is the header a null-terminated string? - Giuseppe Guerrini
Z_CHUNK is called CHUNK and is defined as 16384 in the zpipe.c example from the zlib documentation. Just from knowing the code, I know that header is not null, but I take your point about needing to add error checking. - Farhan Yusufzai
Ok, I just wanted to point out that there should be a space between "Z_CHUNK" and "16384" in your #define at line 1, otherwise you don't actually define the macro "Z_CHUNK", but "Z_CHUNK16384". - Giuseppe Guerrini

2 Answers

1
votes

deflate is not an archiver. It only compresses a stream. Once the stream has been finished, your options are very limited. The manual clearly says:

If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and deflate returns with Z_STREAM_END if there was enough output space. If deflate returns with Z_OK or Z_BUF_ERROR, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error. After deflate has returned Z_STREAM_END, the only possible operations on the stream are deflateReset or deflateEnd.

However, you call deflate on the file data after you have already finished the stream with Z_FINISH for the header, which the manual does not allow, so zlib does not behave the way you expect. The likely fix is to not use Z_FINISH for the header at all, and to let the other side understand that the decompressed data starts with a header (or to impose some archiving protocol understood by both sides).

0
votes

Your first calls to deflate(), the ones that compress the header, should use Z_NO_FLUSH, not Z_FINISH. Z_FINISH should only be used when the last of the data to be compressed is provided with the deflate() call.