20
votes

I'm wondering if CRC32 sum and CRC32C in particular ever return to 0? The simple answer would be "yes" given a large enough data set. However, I was wondering if there is any provisioning in CRC32C standard that would explicitly prevent this from happening.

The use case for this is that I need to be able to check if remote file is empty and all I have is its CRC32C check sum. So, in other words can I infer that if CRC32C is 0 then file is guaranteed to be empty.

If possible, please provide any reference to a standard where this is defined.

3
Can you use your own checksum? In that case, define zero to be only used for the empty file. If zero happens to be produced by the hash function, just set it to 1.usr
You know the CRC32 value but not the length of the file? Huh?kay
@usr CRC32C algorith is highly optimized for speed and is implemented in hardware on Intel CPUs. I need this for calculations at wire speed, so custom implementation is not an option.dtoux
@Kay This is just an example. The actual use case is more complicated than that.dtoux
@dtoux you only need to append: if (crcValue == 0) crcValue = 1;. That's all.usr

3 Answers

27
votes

@Yanek is almost completely correct.

Just for fun, here is a five-character sequence that gives a CRC-32C of zero: DYB|O. Here is a four-byte sequence in hex that gives zero: ab 9b e0 9b. In fact, that is the only four-byte sequence that can do so. There are no three-byte or shorter sequences that will give you zero. That is where @Yanek is not exactly right, in that for three-byte or shorter sequences, zero is not just as likely. The probability of getting a zero is zero in those cases.

22
votes

A zero is as likely as any other value of a CRC32 checksum. A CRC is essentially the remainder of dividing the entire input (taken as one large binary number) by a pre-selected value. If the input happens to be divisible by that value, the remainder, and thus the CRC, is zero.

0
votes

How about this, not a 32-bit CRC, though:

1011 | 110011001010.000
       1011
       ----
        1111
        1011
        ----
         1001
         1011
         ----
           1000
           1011
           ----
             1110
             1011
             ----
              1011
              1011
              ----
                  0000 (...)
                  1011
                  ----
                  1011
                  1011
                  ----
                  0000

Or:

1100 | 11001010.000
       1100
       ----
           1010
           1100
           ----
            1100
            1100
            ----
            (...) 0