Say I have a .txt file like this:
11111111111111Hello and welcome to stackoverflow. stackoverflow will hopefully provide me with answers to answers i do not know. Hello and goodbye.11111111111111
Then I would have an equivalent in binary form (.bin file) created as such:
Stream.Write(intBytes, 0, intBytes.Length); // 11111111111111
Stream.Write(junkText, 0, junkText.Length); // Hello and welcome to stackoverflow...
Stream.Write(intBytes, 0, intBytes.Length); // 11111111111111
The first example compresses better than the second. If I remove the 11111111111111 markers, both files compress to the same size, but with the markers present the .txt version compresses better.
byte[] intBytes = BitConverter.GetBytes(11111111111111); // 8 bytes (the literal is a long)
byte[] strBytes = UTF8Encoding.UTF8.GetBytes("11111111111111"); // 14 bytes (one byte per digit)
This uses the native C++ zlib library.
Before compression the .bin file is smaller, which is what I expected.
Why is the .txt version smaller after compression? It seems to compress better than the .bin equivalent.
bin file: uncompressed size: 2448 bytes, compressed size: 177 bytes
txt file: uncompressed size: 2460 bytes, compressed size: 167 bytes
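To make the comparison easier to reproduce, here is a minimal sketch of how the two buffers could be built and compressed in memory. Several things in it are assumptions rather than my actual setup: it uses .NET's built-in DeflateStream as a stand-in for the native zlib wrapper, the Concat and CompressedSize helpers are hypothetical names, and junkText holds only the short sample sentence rather than the full file contents, so the sizes it prints will not match the numbers above.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class CompressionComparison
{
    // Concatenate byte arrays into one buffer, mirroring the Stream.Write calls.
    static byte[] Concat(params byte[][] parts)
    {
        using (var buffer = new MemoryStream())
        {
            foreach (byte[] part in parts)
            {
                buffer.Write(part, 0, part.Length);
            }
            return buffer.ToArray();
        }
    }

    // Deflate-compress a buffer in memory and return the compressed length in bytes.
    // Assumption: DeflateStream stands in for the native zlib call.
    static long CompressedSize(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the DeflateStream flushes the final block;
            // leaveOpen: true keeps 'output' usable afterwards.
            using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.Length;
        }
    }

    static void Main()
    {
        byte[] intBytes = BitConverter.GetBytes(11111111111111L);   // 8 bytes
        byte[] strBytes = Encoding.UTF8.GetBytes("11111111111111"); // 14 bytes

        // Placeholder text: only the sample sentence, not the real file contents.
        byte[] junkText = Encoding.UTF8.GetBytes(
            "Hello and welcome to stackoverflow. stackoverflow will hopefully " +
            "provide me with answers to answers i do not know. Hello and goodbye.");

        byte[] bin = Concat(intBytes, junkText, intBytes);
        byte[] txt = Concat(strBytes, junkText, strBytes);

        // Dump the raw marker bytes to see what the compressor is given in each case.
        Console.WriteLine("int marker: " + BitConverter.ToString(intBytes));
        Console.WriteLine("str marker: " + BitConverter.ToString(strBytes));

        Console.WriteLine("bin: " + bin.Length + " -> " + CompressedSize(bin));
        Console.WriteLine("txt: " + txt.Length + " -> " + CompressedSize(txt));
    }
}
```

The hex dump of the two markers is the part worth looking at: the string form is the same byte repeated 14 times, while the integer form is 8 bytes of the number's binary representation.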