I have been learning about Radix sort recently, and one of the sources I have used is the Wikipedia page. It currently contains the following paragraph regarding the efficiency of the algorithm:
The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort complexity is O(wn) for n keys which are integers of word size w. Sometimes w is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all perform O(n log n) comparisons to sort n keys. However, in general w cannot be considered a constant: **if all n keys are distinct, then w has to be at least log n for a random-access machine to be able to store them in memory**, which gives at best a time complexity O(n log n). That would seem to make radix sort at most equally efficient as the best comparison-based sorts (and worse if keys are much longer than log n).
The part in bold has unfortunately become a sticking point that I am unable to get past. I understand that in general Radix sort is O(wn), and through other sources I have seen how O(n) can be achieved, but I cannot quite see why n distinct keys force w to be at least log n on a random-access machine, which is what pushes the overall time to O(n log n). I'm fairly certain it comes down to some simple mathematics, but a solid understanding remains just beyond my grasp.
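To keep the O(wn) shape in view while I ask this, here is a minimal LSD radix sort sketch in Python (my own illustration, not from the article; base 10 and non-negative integer keys are assumptions). The outer loop runs once per digit position, i.e. w times, and each pass does O(n) work:

```python
# Minimal LSD radix sort sketch (base 10, non-negative integers assumed).
# The outer loop runs once per digit position (that is the w in O(wn)),
# and each pass does O(n) work scattering keys into buckets and gathering.

def radix_sort(keys, base=10):
    if not keys:
        return keys
    # Width w of the longest key: floor(log_base(max)) + 1 digits.
    width, largest = 1, max(keys)
    while largest >= base:
        largest //= base
        width += 1
    place = 1
    for _ in range(width):            # w passes ...
        buckets = [[] for _ in range(base)]
        for k in keys:                # ... each pass is O(n)
            buckets[(k // place) % base].append(k)
        keys = [k for bucket in buckets for k in bucket]
        place *= base
    return keys

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# -> [2, 24, 45, 66, 75, 90, 170, 802]
```

With w treated as fixed this is O(n) overall, which I take to be the "sometimes w is presented as a constant" case in the quoted paragraph.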
My closest attempt is as follows:
Given a base B and a number N written in that base, the maximum number of digits N can have is:

floor(log_B(N)) + 1
If each of the n numbers in a given list L is unique, then we have up to:

n * (floor(log_B(N)) + 1)

digits in total.
At which point I'm unsure how to progress.
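In case it is useful to anyone answering, here is a tiny numeric check of the attempt above (my own sketch; the helper names are made up). The first helper computes floor(log_B(N)) + 1, the width of a single key N, while the second runs the counting the other way round: w digits in base B can only distinguish B^w values, so n distinct keys would seem to need w >= ceil(log_B(n)).

```python
# digits_needed: floor(log_B(value)) + 1, the digit count of one key.
# min_width_for_distinct: smallest w with base**w >= n, i.e. the minimum
# key width that can keep n keys pairwise distinct.

def digits_needed(value, base=10):
    # computed with integer arithmetic to avoid floating-point log
    width = 1
    while value >= base:
        value //= base
        width += 1
    return width

def min_width_for_distinct(n, base=10):
    width, capacity = 1, base
    while capacity < n:
        capacity *= base
        width += 1
    return width

for n in (10, 1_000, 1_000_000):
    print(n, digits_needed(n - 1), min_width_for_distinct(n))
# -> 10 1 1 / 1000 3 3 / 1000000 6 6
# the two quantities meet at roughly log_B(n)
```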
Could anyone please expand on the section in bold and break down why n distinct keys require a key length of at least log n for storage on a random-access machine?