3
votes

Assume A is an array where A[0] holds the frequency of 0-th letter of the alphabet.

What is the most efficient(*) way of calculating code lengths? Not sure, but I guess efficiency can be in terms of memory usage or steps required.

All I'm interested is the array L where L[0] contains code lengths (number of bits) of 0-th letter of the alphabet, where code comes from canonical huffman tree built out of A frequency array.

1

1 Answers

5
votes

If frequencies form a monotonic sequence, ie. A[0]<=A[1]<=...<=A[n-1] or A[0]>=A[1]>=...>=A[n-1], then you can generate an optimal code lengths in O(n) time and O(1) additional space. This algorithm requires only 2 simple passes over the array and it's very fast. A full description is given in [1].

If your frequencies aren't sorted, first you need to sort them and then apply the above algorithm. In this case time complexity is O(n log n) and an auxiliary array of n integers is needed to store sorted order - space complexity O(n).

[1]: In-Place Calculation of Minimum-Redundancy Codes by Alistair Moffat and Jyrki Katajainen, available online: http://www.diku.dk/~jyrki/Paper/WADS95.pdf