0
votes

Wikipedia states (wrongly, apparently, at least in practice) that the gzip format requires the last 4 bytes to be the uncompressed size (mod 2^32, i.e. 4 GiB). I have found a credible answer on SO explaining that there is sometimes junk at the end of the gzip data, so you cannot rely on the last 4 bytes being the size.

Unfortunately, this matches my experiments (both terminal gzip and the 7zip archiver add a 0x0A byte for my small test example).

My question is: why do gzip and 7zip do this? Obviously they do it because they are written to, but I wonder about the motivation for breaking the format specification. I know that some formats have padding requirements, but I found nothing of the kind for gzip.

Edit: the process I used:

echo "Testing rocks:) Debugging sucks :(" >> test_data

rm test_data.gz

gzip -6 test_data

vim -c "noautocmd edit test_data.gz"

In Vim: :%!xxd -c 4

The last 5 bytes are then the size (35) followed by 0x0a (23 hex = 35, then 00 00 00 0a).

The 7zip process is just using the GUI to make an archive.

1
I did a gzip randomfile.txt and a hexdump. There is no 0x0a at the end of the file. gzip 1.6 Copyright (C) 2007, 2010, 2011 Free Software Foundation, Inc. - Fabien
How did you test this? - Aurel Bílý
Unless this question is updated to show the exact test procedures used, it will need to be closed. - Jonathon Reinhart
Done. - NoSenseEtAl

1 Answer

2
votes

Your testing process is wrong. Vim is what adds 0x0A to the end of the file. Here is a simpler test, using xxd directly (why did you even use Vim?):

echo "Testing rocks:) Debugging sucks :(" >> test_data
gzip -6 test_data
xxd -c 4 test_data.gz

Output:

0000000: 1f8b 0808  ....
0000004: 453c 5d59  E<]Y
0000008: 0003 7465  ..te
000000c: 7374 5f64  st_d
0000010: 6174 6100  ata.
0000014: 0b49 2d2e  .I-.
0000018: c9cc 4b57  ..KW
000001c: 28ca 4fce  (.O.
0000020: 2eb6 d254  ...T
0000024: 7049 4d2a  pIM*
0000028: 4d4f 0789  MO..
000002c: 1497 0245  ...E
0000030: 14ac 34b8  ..4.
0000034: 00f4 a724  ...$
0000038: 5623 0000  V#..
000003c: 00         .
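
For reference, the final eight bytes of that dump (offsets 0x35 through 0x3c) are the gzip trailer: a CRC-32 followed by ISIZE, both little-endian. A minimal Python sketch that decodes them, with the byte values copied by hand from the dump (the variable names are just for illustration):

```python
import struct

# Last eight bytes of the xxd dump above, offsets 0x35 through 0x3c.
trailer = bytes.fromhex("f4a72456" "23000000")

# Both trailer fields are little-endian 32-bit integers:
# CRC-32 of the uncompressed data first, then ISIZE (its length mod 2**32).
crc32, isize = struct.unpack("<II", trailer)

print(f"CRC32 = {crc32:#010x}")
print(f"ISIZE = {isize}")
```

ISIZE comes out as 35, which is the 34 characters of the test string plus the newline that echo appends.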

As you can see, there is no 0x0A at the end. I think Vim adds newlines to the end of files by default, if they are not present.
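
The same effect can be reproduced without any editor involved. This is only a sketch, assuming Python's standard gzip module stands in for the command-line gzip and that the "tool" in question does nothing more than append one newline byte:

```python
import gzip
import struct

# What `echo "..." >> test_data` writes: 34 characters plus a newline.
data = b"Testing rocks:) Debugging sucks :(\n"

compressed = gzip.compress(data, compresslevel=6)

# ISIZE is the final four bytes, little-endian.
(isize,) = struct.unpack("<I", compressed[-4:])
print(isize == len(data))

# Simulate a tool that appends a newline to whatever it is handed.
mangled = compressed + b"\n"
print(mangled[-5:].hex(" "))
```

The untouched stream ends in 23 00 00 00, while the version with the appended byte ends in 23 00 00 00 0a, which is the tail reported in the question.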