5
votes

Are there any 32-bit checksum algorithm with either:

  • Smaller hash collision probability for input data sizes < 1 KB ?
  • Collision hits with more uniform distribution.

These relative to CRC32. I'm practically not counting on first property, because of limitation of storage space of 32 bits. But for the second ... seems there could be improvements.

Any ideas ? Thanks. (I need concrete implementation, better in C, but C++/ C# or anything to start with is also OK).

2
Are you using it as a checksum in an error-correction system, or are you using it as a hash function to probably-detect that two inputs are different by comparing their hashes? Error-correcting codes and hash functions have different desirable properties. In the case of CRC32, it's specifically designed to detect errors of the kind you expect on a noisy line (one bit or a few bits difference, not sure which).Steve Jessop
I'm using it as hash function to compare two peaces of small data. (< 1KB). But i'm forced to 32-bit hash.Agnius Vasiliauskas

2 Answers

4
votes

How about MurmurHash? It is said, that this hash has good distribution (passes chi-square tests) and good avalanche effect. Also very good computing speed.

0
votes

Not for the first criteria. Any well designed hash function with a 32 bit output has a 1 in 2^32 chance of a collision for any pair of inputs. The second criteria is not very well defined, although there are surely some statistical tests that could be used, and I'm sure someone has done it (chi-square for collision intervals?). As for needing an implementation, I strongly recommend that you not accept any proposed code for a hash function that is not an implementation of a well known hash, as there is a high risk of security problems or poor performance when rolling your own hash or encryption. A well known but bad hash function is better than one you designed yourself, even if the latter one tests well and has a 'good' collision distribution, simply because the former has more eyeballs on it.