I have around 270k data block pairs; each pair consists of one 32KiB and one 16KiB block.
When I save them all to one file, I of course get a very large file.
But the data is easily compressed: after compressing the 5.48GiB file with WinRAR using strong compression, the resulting file is only 37.4MiB.
But I need random access to each individual block, so I can only compress the blocks individually.
For that I used the DeflateStream class provided by .NET, which reduced the file size to 382MiB (a size I could live with).
But the speed is not good enough.
A lot of the speed loss is probably due to always creating a new MemoryStream and DeflateStream instance for each block, but it seems they aren't designed to be reused.
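Roughly, the per-block compression currently looks like this (a simplified sketch of the approach, not the exact code):

```csharp
using System.IO;
using System.IO.Compression;

// Current approach: a fresh MemoryStream and DeflateStream for every block.
static byte[] CompressBlock(byte[] block)
{
    using (var output = new MemoryStream())
    {
        // true = leave the MemoryStream open so it can be read afterwards
        using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
        {
            deflate.Write(block, 0, block.Length);
        }
        return output.ToArray();
    }
}
```

This gets called about 540k times (270k pairs × 2 blocks), so the per-call allocations presumably add up.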
And I guess (much?) better compression could be achieved by using a "global" dictionary instead of having one for each block.
Is there an implementation of a compression algorithm (preferably in C#) which is suited for that task?
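If a third-party deflate implementation is an option, here is roughly what I have in mind; this is only an untested sketch assuming SharpZipLib's Deflater (which exposes SetDictionary and Reset) and a shared dictionary built beforehand from representative blocks:

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression;

// Sketch only: one reusable Deflater plus a shared preset dictionary.
// Assumes SharpZipLib; the dictionary would be built once from representative blocks.
class BlockCompressor
{
    readonly Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION); // zlib wrapper kept, since SetDictionary seems tied to it in this library
    readonly byte[] dictionary;
    readonly byte[] buffer = new byte[64 * 1024];

    public BlockCompressor(byte[] sharedDictionary)
    {
        dictionary = sharedDictionary;
    }

    public byte[] Compress(byte[] block)
    {
        deflater.Reset();                    // reuse the same instance for every block
        deflater.SetDictionary(dictionary);  // same dictionary every time, set before any input
        deflater.SetInput(block);
        deflater.Finish();

        using (var output = new MemoryStream())
        {
            while (!deflater.IsFinished)
            {
                int produced = deflater.Deflate(buffer);
                output.Write(buffer, 0, produced);
            }
            return output.ToArray();
        }
    }
}
```

On the decompression side SharpZipLib's Inflater has a matching SetDictionary (it signals the need via IsNeedingDictionary), so as far as I can tell random access per block would still work as long as every block uses the same dictionary. Whether this is actually faster or compresses better than my current approach, I don't know.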
The following link contains the percentages with which each byte value occurs, divided into three block types (32KiB blocks only). The first and third block types each have an occurrence of 37.5% and the second 25%. Block type percentages
Long story short: Type1 consists mostly of ones, Type2 consists mostly of zeros and ones, and Type3 consists mostly of zeros. Values greater than 128 do not occur (yet).
The 16KiB blocks almost always consist of zeros.