I'm going to store records in Azure tables and use partition keys and/or row keys that represent integer values.
Since partition keys and row keys must be stored as strings, I need to choose an encoding scheme that translates between strings and integers.
The keys will range between 0 and 2^63, but most keys will have low values (typically less than 10^6).
I'm looking for an encoding scheme with the following properties:
Strings must be sortable in the same order as the corresponding integers.
Avoid overly long strings for common (low) values.
The encoding must support two modes: one that generates strings for ascending sort order, and one that generates strings for descending sort order.
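To make the ordering requirement concrete, here is a minimal Python sketch of the property I want to hold. The helper name `sorts_like_integers` and the sample values are made up for illustration, and it assumes keys are compared character by character in ordinal order, which is how Python compares strings; I am assuming Azure does the same for keys.

```python
def sorts_like_integers(encode, values, descending=False):
    """Return True if sorting by the encoded string yields the integers
    in the requested numeric order (ascending unless descending=True)."""
    expected = sorted(values, reverse=descending)
    by_string = sorted(values, key=encode)
    return by_string == expected


# Hypothetical sample covering both ends of the key range.
sample = [0, 2, 9, 10, 100, 10**6, 2**63]

# Plain decimal strings fail, because "10" sorts before "2".
print(sorts_like_integers(str, sample))                           # False

# Zero-padded decimal works, but every key is 20 characters long.
print(sorts_like_integers(lambda n: format(n, "020d"), sample))   # True
```

Zero padding satisfies the first requirement but not the second, which is why I am looking for something shorter for low values.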
Ideas:
Simply encode keys using 16 hexadecimal characters and use an inverse alphabet to achieve descending sort order.
While this is a simple and straightforward approach, it has the drawback that it generates overly long strings for common (low) values.
Use a variable-length 7-bit encoding scheme, similar to that in Unicode, to generate shorter strings for low values.
Use the fact that Azure seems to support 16-bit Unicode characters in keys. While some characters are reserved and buggy, I think it should be possible to store at least 14 significant bits per character, making it possible to represent all keys with as few as 5 characters (5 x 14 = 70 bits, which covers the 64 bits needed).
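As a rough sketch of the first idea, and of one possible way to shorten it, here is some Python. All names (`encode_fixed`, `encode_prefixed`, and so on) are hypothetical. The length-prefixed variant is my own simplification rather than the 7-bit scheme mentioned above: it drops leading zeros and prepends one hex digit holding the number of digits minus one, so shorter (smaller) numbers still sort first. Both assume ordinal string comparison.

```python
HEX = "0123456789abcdef"
# Inverse alphabet: 0 <-> f, 1 <-> e, ... reverses the sort order.
INVERT = str.maketrans(HEX, HEX[::-1])


def encode_fixed(n, descending=False):
    """Idea 1: always 16 hex digits (64 bits), optionally inverted."""
    if not 0 <= n < 2**64:
        raise ValueError("key out of range")
    s = format(n, "016x")
    return s.translate(INVERT) if descending else s


def decode_fixed(s, descending=False):
    if descending:
        s = s.translate(INVERT)
    return int(s, 16)


def encode_prefixed(n, descending=False):
    """Variant: one length digit, then the hex digits without leading zeros."""
    if not 0 <= n < 2**64:
        raise ValueError("key out of range")
    digits = format(n, "x")                # 1 to 16 hex digits
    s = HEX[len(digits) - 1] + digits      # prefix stores length - 1
    return s.translate(INVERT) if descending else s


def decode_prefixed(s, descending=False):
    if descending:
        s = s.translate(INVERT)
    if len(s) != int(s[0], 16) + 2:
        raise ValueError("length prefix does not match")
    return int(s[1:], 16)


print(encode_fixed(255))               # 00000000000000ff
print(encode_fixed(255, True))         # ffffffffffffff00
print(encode_prefixed(10**6))          # 4f4240
```

With the prefixed variant a key below 10^6 takes at most 6 characters instead of 16, at the cost of one extra character (17 in total) for the largest keys.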
Any suggestions?