Typical URL-shortening techniques use only a few characters of the usual URL character set, because they do not need more. A typical short URL is http://domain/code, where code is an integer. Suppose that I can use any base (base10, base16, base36, base62, etc.) to represent the number.
QR Code has several encoding modes, and we can optimize the QR Code (choosing the minimal version, to obtain the lowest density), so we can test pairs of baseX-modeY...
What is the best base-mode pair?
NOTES
A guess...
Two modes fit the "URL shortening profile":
- 0010 - Alphanumeric encoding (11 bits per 2 characters)
- 0100 - Byte encoding (8 bits per character)
My choice was "upper-case base36" with Alphanumeric mode (which also encodes "/", ":", etc.), but I have not seen any demonstration that it is always (for any URL length) the best. Is there a good guide or mathematical demonstration for this kind of optimization?
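As a rough way to compare candidate pairs, one can measure how much of each spent bit actually carries information about the integer: log2(base) divided by the bits per character of the mode. The sketch below is only an asymptotic estimate (it ignores the mode indicator, the character count field and rounding to whole characters). The names BITS_PER_CHAR, PAIRS and efficiency are illustrative, and the Numeric mode figure (10 bits per 3 digits) is included for reference.

```javascript
// Payload bits a QR Code spends per character in each mode (headers ignored).
const BITS_PER_CHAR = { numeric: 10 / 3, alphanumeric: 11 / 2, byte: 8 };

// Candidate pairs: an integer written in `base`, stored in `mode`.
const PAIRS = [
  { base: 10, mode: "numeric" },
  { base: 16, mode: "alphanumeric" }, // upper-case hexadecimal
  { base: 36, mode: "alphanumeric" }, // 0-9 A-Z
  { base: 36, mode: "byte" },         // 0-9 a-z
  { base: 62, mode: "byte" },         // 0-9 A-Z a-z
];

// Fraction of the spent bits that carries information about the integer.
function efficiency(base, mode) {
  return Math.log2(base) / BITS_PER_CHAR[mode];
}

for (const { base, mode } of PAIRS) {
  console.log(`base${base} + ${mode}: ${efficiency(base, mode).toFixed(3)}`);
}
```

Under this measure, base36 with Alphanumeric (about 0.940) is ahead of base62 with Byte (about 0.744), and base10 with Numeric (about 0.997) is the densest for the number itself, although Numeric mode cannot hold the "HTTP://DOMAIN/" part.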
The ideal (perhaps impracticable)
There is another variation: "encoding modes can be mixed as needed within a QR symbol" (Wikipedia)... So we can also use
HTTP://DOMAIN/
with Alphanumeric + change_mode + Numeric encoding (10 bits per 3 digits)
For long URLs (long integers), of course, this is the best solution (!), because each mode uses its whole character set, with no loss... Is it?
The problem is that this kind of optimization (mixed mode) is not available in the usual QR Code image generators... Is it practicable? Is there a generator that does it correctly?
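Whether mixing modes pays off can be checked by counting bits, because every extra segment costs its own header. The sketch below assumes QR versions 1 to 9, where each segment starts with a 4-bit mode indicator followed by a character count field of 10 bits (Numeric), 9 bits (Alphanumeric) or 8 bits (Byte). The helper names (payloadBits, segmentBits, digitCount, compare) are hypothetical, and the integer is passed as a BigInt so that long codes are counted exactly.

```javascript
// Segment header sizes, assuming the symbol fits in QR versions 1-9.
const MODE_INDICATOR_BITS = 4;
const COUNT_BITS = { numeric: 10, alphanumeric: 9, byte: 8 };

// Payload bits for n characters in a given mode.
function payloadBits(mode, n) {
  switch (mode) {
    case "numeric":      // 10 bits per 3 digits; 4 or 7 bits for 1 or 2 leftover digits
      return 10 * Math.floor(n / 3) + [0, 4, 7][n % 3];
    case "alphanumeric": // 11 bits per 2 characters; 6 bits for a leftover character
      return 11 * Math.floor(n / 2) + 6 * (n % 2);
    case "byte":
      return 8 * n;
    default:
      throw new Error("unknown mode: " + mode);
  }
}

// Total bits of one segment: mode indicator + character count + payload.
function segmentBits(mode, n) {
  return MODE_INDICATOR_BITS + COUNT_BITS[mode] + payloadBits(mode, n);
}

// Digits needed to write the non-negative BigInt n in `base` (2..36).
function digitCount(n, base) {
  return n.toString(base).length;
}

// Compare one Alphanumeric segment (base36 code) with
// Alphanumeric prefix + Numeric segment (base10 code).
function compare(prefix, n) {
  const single = segmentBits("alphanumeric", prefix.length + digitCount(n, 36));
  const mixed = segmentBits("alphanumeric", prefix.length) +
                segmentBits("numeric", digitCount(n, 10));
  return { single, mixed };
}

console.log(compare("HTTP://BIT.LY/", 894n));            // short code
console.log(compare("HTTP://BIT.LY/", 2n ** 320n - 1n)); // 40-byte code
```

With these assumptions, the short code costs 101 bits as a single segment against 114 bits mixed, while the 40-byte code costs 431 bits against 428 bits. So the extra 14-bit Numeric header is only recovered for quite long integers, and even then the gain is small.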
An alternative answer format
The (practicable) question is about the best combination of base and mode, so we can express it as a function (e.g. in JavaScript):
function bestBaseMode(domain, number_range) {
    var dom_len = domain.length;
    var urlBase_len = dom_len + 8; // 8 = "http://".length + "/".length
    var num_min = number_range[0];
    var num_max = number_range[1];
    var base, mode;
    // ... check optimal base and mode
    return [base, mode];
}
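One way the body could be filled in, as a minimal sketch rather than a definitive answer: enumerate a few single-segment candidates and keep the one that needs the fewest bits for the largest number. It assumes QR versions 1 to 9 (4-bit mode indicator, 9-bit or 8-bit character count), that the whole URL goes into one segment, that the server accepts an upper-case URL when Alphanumeric mode is chosen, and that only num_max matters (worst case). CANDIDATES, HEADER_BITS, digitCount and urlBits are hypothetical names.

```javascript
// Candidate pairs. Alphanumeric mode has no lower-case letters, so it caps
// the base at 36; Byte mode also allows base62.
const CANDIDATES = [
  { base: 10, mode: "alphanumeric" },
  { base: 16, mode: "alphanumeric" },
  { base: 36, mode: "alphanumeric" },
  { base: 36, mode: "byte" },
  { base: 62, mode: "byte" },
];

// Segment header: mode indicator + character count field (versions 1-9).
const HEADER_BITS = { alphanumeric: 4 + 9, byte: 4 + 8 };

// Digits needed to write the non-negative BigInt n in the given base.
function digitCount(n, base) {
  let digits = 1;
  for (let rest = n / BigInt(base); rest > 0n; rest /= BigInt(base)) digits++;
  return digits;
}

// Bits for a URL of `chars` characters stored as one segment.
function urlBits(mode, chars) {
  const payload = mode === "byte"
    ? 8 * chars
    : 11 * Math.floor(chars / 2) + 6 * (chars % 2);
  return HEADER_BITS[mode] + payload;
}

function bestBaseMode(domain, number_range) {
  const urlBase_len = domain.length + 8; // "http://" + "/"
  const num_max = BigInt(number_range[1]); // worst case; num_min is not used here
  let best = null;
  for (const { base, mode } of CANDIDATES) {
    const bits = urlBits(mode, urlBase_len + digitCount(num_max, base));
    if (best === null || bits < best.bits) best = { base, mode, bits };
  }
  return [best.base, best.mode];
}

console.log(bestBaseMode("bit.ly", [4, 894]));
```

For "bit.ly" with the range 4 to 894 this selects base36 with Alphanumeric mode (101 bits, against 140 bits for base62 with Byte mode). Mixed-mode encodings are deliberately left out of the candidate list.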
Example-1: the domain is "bit.ly" and the code is an ISO 3166-1 numeric country code, ranging from 4 to 894. So urlBase_len=14, num_min=4 and num_max=894.
Example-2: the domain is "postcode-resolver.org" and the number_range parameter is the range of the integer representations of the most frequent postal codes, for instance a statistically inferred range from ~999 to ~999999. So urlBase_len=29, num_min=999 and num_max=999999.
Example-3: the domain is "my-example3.net" and number_range is a double SHA-1 code, so a fixed-length code of 40 bytes (two concatenated 40-digit hexadecimal numbers). So num_max=num_min=Math.pow(256,40), i.e. 2^320.