Why is it advantageous for the table size to be prime when using secondary hashing?

Question

When it comes to secondary hashing, why is it advantageous for the size of the table to be a prime number? I don't quite understand this point.

templatetypedef · Accepted Answer · 2014-09-03T21:16:25

In secondary ("double") hashing, you have two hash functions h1 and h2 and the probe sequence is formed by evaluating

probe(x, i) = (h1(x) + i · h2(x)) mod tableSize

Let's suppose that tableSize isn't a prime number. For example, suppose that the table size is 12. Imagine that you have hashes h1 and h2 such that for some input x, we get h1(x) = 0 and h2(x) = 6. In this case, the probe sequence will be 0, 6, 0, 6, 0, 6, ..., which never reaches most of the table (in fact, it just visits two slots over and over again). More generally, the probe sequence visits only tableSize / gcd(h2(x), tableSize) distinct slots, so whenever h2(x) shares a factor greater than 1 with the table size, the sequence will not cover the whole table. A nontrivial divisor such as 6 is the most obvious case, but h2(x) = 8 has the same problem with a table of size 12: gcd(8, 12) = 4, so only the three slots 0, 8, 4 are ever probed.
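To see this concretely, here is a small Python sketch that evaluates the probe formula above for fixed hash values. The function name probe_sequence and the idea of passing h1(x) and h2(x) in as plain integers are just assumptions made for illustration, not part of any particular hash table implementation.

```python
from math import gcd


def probe_sequence(h1, h2, table_size):
    """Slots visited by the first table_size probes of double hashing.

    h1 and h2 stand in for the values h1(x) and h2(x) of one fixed key x.
    """
    return [(h1 + i * h2) % table_size for i in range(table_size)]


# The example from the text: table size 12, h1(x) = 0, h2(x) = 6.
seq = probe_sequence(0, 6, 12)
print(seq[:6])        # [0, 6, 0, 6, 0, 6]
print(sorted(set(seq)))  # [0, 6]

# For every step size, the number of distinct slots reached
# is table_size / gcd(step, table_size).
for step in range(1, 12):
    distinct = len(set(probe_sequence(0, step, 12)))
    assert distinct == 12 // gcd(step, 12)
```

With a table of size 12, only the step sizes 1, 5, 7 and 11 (the ones coprime to 12) reach all twelve slots.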

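So how do you ensure that h2(x) never has a factor in common with the table size, and in particular is never a nontrivial divisor of it? Well, one simple way to do that is to make the table size prime: a prime p has no divisors other than 1 and itself, so every step size from 1 to p - 1 is coprime to it. If you choose a prime table size and make sure that h2(x) is never 0 modulo the table size, you're guaranteed that the first tableSize probes visit every slot of the table exactly once. The proof takes a bit of modular arithmetic, but the idea is short: if probes i and j landed on the same slot, then (i - j) · h2(x) would be a multiple of the table size, which is impossible for a prime table size when 0 < |i - j| < tableSize and h2(x) is not a multiple of it.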
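As a quick sanity check of that claim, the following Python sketch brute-forces every starting slot and every nonzero step size for one prime table size. The helper name covers_table and the choice of 13 as the table size are assumptions made for this example.

```python
def covers_table(h1, h2, table_size):
    """True if the first table_size probes hit every slot of the table."""
    visited = {(h1 + i * h2) % table_size for i in range(table_size)}
    return len(visited) == table_size


TABLE_SIZE = 13  # a prime, chosen only for illustration

# Every start slot combined with every step size 1..12 covers the table.
all_cover = all(
    covers_table(start, step, TABLE_SIZE)
    for start in range(TABLE_SIZE)
    for step in range(1, TABLE_SIZE)
)
print(all_cover)  # True

# A step of 0 never leaves the starting slot, prime table size or not.
print(covers_table(0, 0, TABLE_SIZE))  # False
```

The second check is why h2 is usually built so that it can never evaluate to 0, for example by adding 1 to a value reduced modulo tableSize - 1.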

Hope this helps!
