I am working on a compression and encryption system and have run into a roadblock that I think I can overcome by talking it through with some professionals.

Goal

I want a file container system that can hold many types of files. I want to be able to compress (LZF) and encrypt (AES) each file according to rules that are defined at compression/encryption time and stored in the file headers. I also want to be able to retrieve files from this container as streams. That means the files have to be compressed and encrypted in blocks so that they can be read sequentially. Otherwise, I would have to decrypt and then decompress the entire file into memory at once, and I want this process to use as little memory as possible.

Current Status

I have completed the header system, which contains information about the container itself, the file table and each file's information. I am currently working on the file-writing stream that creates the actual containers.

Issue

While working on this, I have been trying to figure out how to compress a file's data into a byte array and then encrypt it in blocks. I think I have settled on 1024-byte blocks, each of which would be stored as 64 blocks of AES-encrypted data, since AES encrypts in 128-bit (16-byte) blocks. The whole system would be set up using streams, which means I have no control over how much data is sent to my system at a time. My issue is that when compressing the data, I have no idea how big the compressed output will be. It could be smaller than, the same size as, or even bigger than the original. I need to know how to decompress the data successfully in blocks.
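The unpredictable output size is easy to see in a quick experiment. The sketch below is only an illustration: it uses Python's `zlib` as a stand-in for LZF (an assumption, since LZF is not in the standard library), and the helper name `compressed_size` is hypothetical.

```python
import os
import zlib

AES_BLOCK = 16  # AES always works on 128-bit (16-byte) blocks


def compressed_size(data: bytes) -> int:
    """Return the size of data after compression (zlib stands in for LZF)."""
    return len(zlib.compress(data))


repetitive = b"A" * 1024        # highly compressible input
random_data = os.urandom(1024)  # effectively incompressible input

print(compressed_size(repetitive))   # far fewer than 1024 bytes
print(compressed_size(random_data))  # slightly more than 1024 bytes
print(1024 // AES_BLOCK)             # AES blocks per 1024-byte container block
```

Because the output can grow as well as shrink, the container cannot assume any fixed relationship between input size and stored size.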

Issue Example

Let's say I have a 128-byte block of information that I want compressed, encrypted and saved to the system I have described. I would write it to the stream, which in turn would compress it. Let's also say that the 128-byte block is compressed down to a 64-byte block. Then I send another block whose length is 256 bytes, and it is compressed down to 128 bytes. Both compressed blocks are copied into a buffer, and the resulting 192-byte buffer is sent to be encrypted, which produces twelve 16-byte blocks. These are then written to the container in a 1024-byte block.

In this example, I have no trouble decrypting the information, since it is stored in blocks, but I cannot say the same for the decompression step. From my understanding of compression, if I try to decompress the 64-byte and 128-byte compressed blocks together, I will get invalid data because they were originally compressed separately. If this needs more clarification, please let me know.
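One common way to keep separately compressed blocks recoverable is to store each one behind a length prefix, so the reader knows exactly where every compressed block ends. The following is a minimal sketch of that idea, not the container format described above: `zlib` stands in for LZF, the 4-byte big-endian prefix is an assumed layout, and `write_chunk` and `read_chunks` are hypothetical helper names. It also assumes `src.read(n)` returns `n` bytes unless the stream ends, as `io.BytesIO` and buffered files do.

```python
import io
import struct
import zlib

HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix (assumed layout)


def write_chunk(out, data: bytes) -> None:
    """Compress one chunk on its own and store it behind its compressed length."""
    packed = zlib.compress(data)  # zlib stands in for LZF here
    out.write(HEADER.pack(len(packed)))
    out.write(packed)


def read_chunks(src):
    """Yield the original chunks one at a time, never holding the whole file."""
    while True:
        header = src.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise ValueError("truncated length prefix")
        (size,) = HEADER.unpack(header)
        packed = src.read(size)
        if len(packed) != size:
            raise ValueError("truncated chunk")
        yield zlib.decompress(packed)


container = io.BytesIO()
write_chunk(container, b"a" * 128)  # first write from the example
write_chunk(container, b"b" * 256)  # second write from the example

container.seek(0)
restored = list(read_chunks(container))
```

The framed bytes can then be encrypted in 16-byte AES blocks as a plain byte stream. After decryption, the length prefixes tell the decompressor where each independently compressed block starts and ends.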

1 Answer

When you read data from the stream, you get a byte count back. Only for the last block will that count be smaller than the block size. That is your signal to call WriteFinalBlock and let the encryptor apply its padding.
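A rough sketch of that read loop follows. It is an illustration under stated assumptions rather than any library's API: the padding scheme is assumed to be PKCS#7, the helper names `pkcs7_pad` and `blocks_for_encryption` are hypothetical, and `src.read(n)` is assumed to return `n` bytes unless the stream ends. The loop reads one block ahead so that it still finds the final block when the data length is an exact multiple of the block size.

```python
import io

BLOCK_SIZE = 1024  # container block size from the question
AES_BLOCK = 16     # AES block size in bytes


def pkcs7_pad(data: bytes, block: int = AES_BLOCK) -> bytes:
    """Pad data to a multiple of block; a full block is added if already aligned."""
    missing = block - len(data) % block
    return data + bytes([missing]) * missing


def blocks_for_encryption(src):
    """Yield (block, is_final) pairs; only the final block is padded."""
    current = src.read(BLOCK_SIZE)
    while True:
        upcoming = src.read(BLOCK_SIZE)  # look ahead to detect the end
        if not upcoming:
            yield pkcs7_pad(current), True
            return
        yield current, False
        current = upcoming


stream = io.BytesIO(b"x" * 2500)
result = list(blocks_for_encryption(stream))
sizes = [len(block) for block, _ in result]
flags = [is_final for _, is_final in result]
```

Every yielded block has a length that is a multiple of 16 bytes, so each can be handed to an AES encryptor directly.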