1
votes

Well I am currently trying to implement a compression algorithm in my project, it has to be lz77 as a matter of effect... I am already able to decompress data but I can't imagine where to start in terms of compressing. I thought it would be best to pass by a byte array with data, but that's about it... My problem is that all descriptions of the algorithm are quite cryptic for me to understand. I would appreciate a clear description of how the algorithm works and what I have to watch for. Also: Am I obliged to use unsafe methods and pointers when coding in C#? It would be better to avoid that... I suppose.

Here is what I got so far thanks to the information you gave me:

private const int searchWindow = 4095;
    private const byte lookaheadWindow = 15;
    public static byte[] lzCompressData(byte[] input)
    {
        int position = 0;
        List<byte> tempInput = input.ToList();
        List<byte> output = new List<byte>();
        MemoryStream init = new MemoryStream();
        BinaryWriter inbw = new BinaryWriter(init);
        inbw.Write(((input.Length << 8) & 0xFFFFFF00) | 0x10);
        output.AddRange(init.ToArray());
        while (position < input.Length)
        {
            byte decoder = 0;
            List<byte> tempOutput = new List<byte>();
            for (int i = 0; i < 8; ++i)
            {
                List<byte> eligible;
                if(position < 255)
                {
                    eligible = tempInput.GetRange(0, position);
                }
                else
                {
                    eligible = tempInput.GetRange(position - searchWindow, searchWindow);
                }
                if (!(position > input.Length - 8))
                {
                    MemoryStream ms = new MemoryStream(eligible.ToArray());
                    List<byte> currentSequence = new List<byte>();
                    currentSequence.Add(input[position]);
                    int offset = 0;
                    int length = 0;
                    long tempoffset = StreamHelper.FindPosition(ms, currentSequence.ToArray());
                    while ((tempoffset != -1) && (length < lookaheadWindow) && position < input.Length - 8)
                    {
                        offset = (int)tempoffset;
                        length = currentSequence.Count;
                        position++;
                        currentSequence.Add(input[position]);

                    }
                    if (length >= 3)
                    {
                        decoder = (byte)(decoder | (byte)(1 << i));
                        byte b1 = (byte)((length << 4) | (offset >> 8));
                        byte b2 = (byte)(offset & 0xFF);
                        tempOutput.Add(b1);
                        tempOutput.Add(b2);
                    }
                    else
                    {
                        tempOutput.Add(input[position]);
                        position++;
                    }
                }
                else
                {
                    if (position < input.Length)
                    {
                        tempOutput.Add(input[position]);
                        position++;
                    }
                    else
                    {
                        tempOutput.Add(0xFF);
                    }
                }
            }
            output.Add(decoder);
            output.AddRange(tempOutput.ToArray());
        }
        return output.ToArray();
    }
1
Is there a particular reason you don't just use the supplied GZipStream? If you just want something that works, that'll do. If you want to write your own as an exercise, you'll have to do a lot of research.Jim Mischel

1 Answers

2
votes

I would apreciate a clear description of how the algorithm works and what I have to watch for

it is very well explained here . If you have problem in understanding something specific please ask that.

Am I obliged to use unsafe methods and pointers when coding in C#

You don't have to worry about anything. No need to re-invent the wheel. it is already implemented. its implementation