1 vote

With Huffman encoding we simply generate a map of symbol -> code. Then, when run-length encoding, we use this map to replace each symbol with its code. This makes it easy to mix codes with other symbols that we didn't want to encode/compress. For instance, in JPEG we Huffman-encode the pair [number of preceding zeros, number of bits of the AC coefficient], write it to the bitstream, and follow it with the raw bit representation of the AC coefficient itself. This is a very convenient property of Huffman encoding.
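To make that mixing concrete, here is a minimal sketch of appending a Huffman code and then raw coefficient bits to the same bitstream. The `BitWriter` helper and the 3-bit code `0b100` are made up for illustration; they are not from any real JPEG codec:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical MSB-first bit writer: both Huffman codes and raw value
// bits go through the same put() call, which is exactly what makes the
// mixing trivial with prefix codes.
struct BitWriter {
    std::vector<uint8_t> bytes;
    uint32_t acc = 0;   // bit accumulator
    int nbits = 0;      // number of valid bits currently in acc

    // Append the `count` low bits of `value`, MSB first.
    void put(uint32_t value, int count) {
        acc = (acc << count) | (value & ((1u << count) - 1));
        nbits += count;
        while (nbits >= 8) {          // emit full bytes as they complete
            nbits -= 8;
            bytes.push_back(uint8_t(acc >> nbits));
        }
    }

    // Pad the final partial byte with zeros.
    void flush() {
        if (nbits) {
            bytes.push_back(uint8_t(acc << (8 - nbits)));
            nbits = 0;
            acc = 0;
        }
    }
};
```

Usage might look like `bw.put(0b100, 3)` for a (run, size) Huffman code followed by `bw.put(magnitudeBits, size)` for the coefficient itself; the decoder reads the prefix code first and then knows exactly how many raw bits follow.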

Now what I want to ask is whether it is possible to do something similar with arithmetic coding (in the context of asymmetric numeral systems, since that's what I'm implementing). I have no idea how to tackle this.


3 Answers

1 vote

There are various ways of mixing in raw bits; see for example bypass coding in https://fgiesen.wordpress.com/2015/12/21/rans-in-practice/. More generally, the best place to get help with data compression is the encode.ru forum.

0 votes

Sure, assuming that you know how many symbols to decode from the arithmetic code/ANS before switching to some other interpretation of the subsequent bits. This is no different from the Huffman case, where you also have to know how many symbols to read before changing your interpretation of the bits.

0 votes

@Jarek Duda: Thanks for the link. I went there and, with the help of the source code, was able to implement writing/reading of raw bits. I would like to ask for some explanation of how the normalization works, though. My current implementation of WriteRaw looks like this:

uint32 _x;              // 32-bit coder state
deque<uint16> _words;   // output words emitted during renormalization

...

void NCommon::ANSCoder32::WriteRaw(uint16 value, uint8 bitsCount)
{
    // A bitsCount-bit raw value acts like a symbol with frequency
    // 2^(16 - bitsCount) out of a 2^16 total.
    uint32 freq = 1 << (16 - bitsCount);
    // Renormalization threshold: the state must satisfy _x < 2^16 * freq
    // before encoding so the new state still fits in 32 bits.
    uint32 maxX = ((uint32)1 << 16) * freq;

    if (_x >= maxX)
    {
        _words.push_back(_x & 0xFFFF);
        _x >>= 16;
    }

    _x = (_x << bitsCount) | value;
}

I figured out (based on the website above) what freq and maxX should be, but I have no idea why they are defined the way they are. I would really welcome some explanation.
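Not an authoritative answer, but one way to see it: in rANS with a 16-bit total M = 2^16, encoding a symbol of frequency freq roughly multiplies the state by M/freq, so the state must satisfy x < freq * 2^16 beforehand for the result to stay within 32 bits and be recoverable with a single 16-bit renormalization step. A bitsCount-bit raw value is one of 2^bitsCount equally likely symbols, hence freq = 2^16 / 2^bitsCount = 1 << (16 - bitsCount), and maxX = freq << 16. Below is a self-contained sketch of the encoder together with a matching ReadRaw; the struct name, the starting state L = 2^16, and reading words from the back of the deque are my assumptions, not taken from your codebase:

```cpp
#include <cstdint>
#include <deque>

// Stand-alone sketch of raw-bit (bypass) coding with a 32-bit ANS state.
// ANS is LIFO: the last value written is the first one read back, and the
// decoder must consume renormalization words in reverse order of emission.
struct RawANS32 {
    uint32_t x = 1u << 16;          // state starts at the lower bound L = 2^16
    std::deque<uint16_t> words;     // 16-bit words spilled during renormalization

    void writeRaw(uint16_t value, uint8_t bits) {
        uint32_t freq = 1u << (16 - bits);  // implicit frequency of each raw value
        uint32_t maxX = freq << 16;         // keep x < freq * 2^16 before encoding
        if (x >= maxX) {                    // one step suffices for a 32-bit state
            words.push_back(uint16_t(x));
            x >>= 16;
        }
        x = (x << bits) | value;
    }

    uint16_t readRaw(uint8_t bits) {
        uint16_t value = uint16_t(x & ((1u << bits) - 1));  // low bits hold the value
        x >>= bits;
        if (x < (1u << 16)) {               // refill to restore the invariant x >= L
            x = (x << 16) | words.back();
            words.pop_back();
        }
        return value;
    }
};
```

Note that readRaw is the exact inverse of writeRaw run backwards: it strips the low bits first, then undoes the renormalization, which is why the refill test uses the lower bound 2^16 rather than maxX.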