1
votes

I have an application that calculates the crc32 over some stream of data of length l. However I want to remove the last 4 bytes I crc'ed from the final crc32 result meaning that I actually want the result to be the crc32 of the data over length (l-4). Is there an efficient way to do this?

Edit: I know the last 4 bytes I want to exclude.

1
Do you still know the last 4 bytes? And why can't you just stop hashing when you reach L - 4?AKX
I know the last 4 bytes which I want to exclude. During normal data transfer I get a stream of data of 231 bytes. The last transfer can take any amount of bytes from 1 up to 231 bytes. So If the last transfer is only 1 bytes long, my crc32 would already include 3 bytes I didn't want to include.TrinityTonic

1 Answers

2
votes

Yes it's possible.

First, CRC is linear, so we can find what the CRC would have been if those last 4 bytes were 0 by computing crcOfData ^ crc(last4Bytes). There are some minor variations there depending on the details of your CRC though.

Secondly, the action of "remove the last bit assuming it was zero" can be modeled by a 32x32 boolean matrix, namely:

uint32_t inv1[32];
uint32_t row = 2;
for (int n = 0; n < 31; n++) {
    inv1[n] = row;
    row <<= 1;
}
inv1[31] = 0x05EC76F1; // reciprocal of your crc polynomial (I used the one that matches _mm_crc32)

The matrix for "remove 32 zero bits" can be found by squaring the matrix a couple of times:

uint32_t inv[32];
gf2_matrix_square(inv, inv1); // 2
gf2_matrix_square(inv1, inv); // 4
gf2_matrix_square(inv, inv1); // 8
gf2_matrix_square(inv1, inv); // 16
gf2_matrix_square(inv, inv1); // 32


uint32_t gf2_matrix_times(uint32_t *mat, uint32_t vec)
{
    uint32_t sum = 0;
    while (vec) {
        if (vec & 1)
            sum ^= *mat;
        vec >>= 1;
        mat++;
    }
    return sum;
}

void gf2_matrix_square(uint32_t *square, uint32_t *mat)
{
    for (int n = 0; n < 32; n++)
        square[n] = gf2_matrix_times(mat, mat[n]);
}

Since squaring that matrix 5 times is independent of the data, you could hard-code the result.

The actual "remove 4 bytes" would be found with gf2_matrix_times(inv, crcOfData ^ crc(last4Bytes)), for example just to verify that it worked:

auto crc0 = _mm_crc32_u32(0, 0xDEADBEEF);
auto crc1 = _mm_crc32_u32(crc0, 0xCAFEBABE);
auto undo = gf2_matrix_times(inv, crc1 ^ _mm_crc32_u32(0, 0xCAFEBABE));