3
votes

at section 5.5 of the PNG Specification, it discusses this concept in the PNG file format called "CRC" or "Cyclic Redundancy Code". I've never heard of it before, so I'm trying to understand it.

The CRC polynomial employed is

x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1

In PNG, the 32-bit CRC is initialized to all 1's, and then the data from each byte is processed from the least significant bit (1) to the most significant bit (128). After all the data bytes are processed, the CRC is inverted (its ones complement is taken). This value is transmitted (stored in the datastream) MSB first. For the purpose of separating into bytes and ordering, the least significant bit of the 32-bit CRC is defined to be the coefficient of the x31 term.

So let me tell you what I understand and what I don't understand about this.

I've heard of polynomials, but in this context I'm a bit confused of how they are implemented here.

In this case, what is "x" supposed to represent? The current bit in the 32-bit looP? Which brings us to the next part:

so it says to make an empty 32-bit number (or rather, all set to 1s, so 32 1s), then it says it's "processed from the least significant bit (1) to the most significant bit (128)", but the question is, the "least...most..significant bit" of what?

Of the other data in the chunk?

How does that work, if the chunk is set in bytes, and this is only 32 bits? What if there are more than 32 bits in the chunk data (which there definitely are?)

Does it mean "least..most..significant bit" of the "polynomial"?

But what does the polynomial represent exactly? What is x^32?

What is x represented by?

Any help with the above questions, and perhaps a simple example with the example IDATA chunk (AKA calculating the CRC chunk for it with basic explanations) would be great:

0 0 2 3 IDAT 0 1 0 1 0 1 0 1 0 1 0 C

where the last byte "C" should be replaced with that 32-bit CRC thing it was talking about.

Can someone provide me with a practical example?

3
" I don't know where they got that big bit number in the for loop, and I don't know where to insert the byte array" - It's one of the standard numbers for a 32 bit CRC. Go to the wiki article, and scroll down to the 32 bit CRC's, you'll see it there How CRCs are chosen is complicated, and I don't know why specific CRCs are chosen out of a group of CRCs that have the same error detection capability. - rcgldr
@rcgldr hi thanks, I noticed on the wiki it gives a few bit examples, using the same polynomial from other example questions. The question is I want to understand how to arrive at that 32-bit array, from the polynomial. I'm still very confused what "x" represents, it mentioned in the other answer that nothing is "plugged into" it, but then what is it, and how does that polynomial give the 4 different bit results? And the other point is, once I have the 32-bit string, but then what do I do to it with the other input, do I loop over the other byte array then do something, but what? - bluejayke
@rcgldr also I searched for a javascript solution, all I was able to find was this medium.com/the-guardian-mobile-innovation-lab/… but they use a built in function for the CRC.. most of the others were either using canvas, or some other library, or server-side code - bluejayke
@rcgldr the thing is I'm not interested in multiple formats nor compression, I'm only interested in a very simple PNG generator (mainly to run inside google apps script), and there are hardly any extremely light weight javascirpt PNG libraries (pngJS is too overkill) to achieve what I want - bluejayke
OK, but isn't zlib compression header and CRC (which I get the impression is separate from the PNG CRC) needed even if the compression is turned off? You'll need something for that. - rcgldr

3 Answers

1
votes

Beware: If you use (00000000)_2 and (00000001)_2 as the binary representations of the 0s and 1s in your example IDAT chunk, you will compute the CRC incorrectly. The ASCII values of '0' and '1' are 48 = (00110000)_2 and 49 = (00110001)_2; similarly, the ASCII values of 'I', 'D', 'A', and 'T' are 73 = (01001001)_2, 68 = (01000100)_2, 65 = (01000001)_2, and 84 = (01010100)_2. So, assuming you meant the values 0 and 1 rather than the characters ‘0’ and ‘1’, what you must compute the CRC of is (01001001 01000100 01000001 01010100 00000000 00000001 00000000 00000001 00000000 00000001 00000000 00000001 00000000 00000001 00000000)_2.

Inconsequentially to the CRC but consequentially to the validity of the chunk, the length field (i.e., the first 4 bytes) of the chunk should contain the length in bytes of the data only, which is 11, which is the ASCII value of a vertical tab (VT), which is a nonprinting character but can be represented in strings by the hexadecimal escape sequence \x0B (in which (B)_16 = 11). Similarly, the first 3 bytes must contain the character for which the ASCII value is 0 (rather than 48), which is null (NUL), which can be represented in strings by the hexadecimal escape sequence \x00. So, the length field must contain something like "\x00\x00\x00\x0B".

2
votes

I would recommend reading Ross Williams' classic "A Painless Guide to CRC Error Detection Algorithms". Therein you will find in-depth explanations and examples.

The polynomial is simply a different way to interpret a string of bits. When you have n bits in a register, they are most commonly interpreted either as just that, a list of n independent bits, or they are interpreted as an integer, where you multiply each bit by two raised to the powers 0 to n-1 and add them up. The polynomial representation is where you instead interpret each bit as the coefficient of a polynomial. Since a bit can only be a 0 or a 1, the resulting polynomials never actually show the 0 or 1. Instead the xn term is either there or not. So the four bits 1011 can be interpreted to be 1 x3 + 0 x2 + 1 x1 + 1 x0 = x3 + x + 1. Note that I made the choice that the most significant bit was the coefficient of the x3 term. That is an arbitrary choice, where I could have chosen the other direction.

As for what x is, it is simply a placeholder for the coefficient and the power of x. You never set x to some value, nor determine anything about x. What it does is allow you to operate on those bit strings as polynomials. When doing operations on these polynomials, you treat them just like the polynomials you had in algebra class, except that the coefficients are constrained to the field GF(2), where the coefficients can only be 0 or 1. Multiplication becomes the and operation, and addition becomes the exclusive-or operation. So 1 plus 1 is 0. You get a new and different way to add, multiply, and divide strings of bits. That different way is key to many error detection and correction schemes.

It is interesting, but ultimately irrelevant, that if you set x to 2 in the polynomial representation of a string of bits (with the proper ordering choice), you get the integer interpretation of that string of bits.

2
votes

The spec includes a link to example code:

https://www.w3.org/TR/2003/REC-PNG-20031110/#D-CRCAppendix

The spec has errors or is confusing.

That should be "data from each byte is processed from the least significant bit(0) to the most significant bit bit(7).

The CRC is a 33 term polynomial, where each term has a one bit coefficient, 0 or 1, with the 0 coefficients ignored when describing the polynomial.

Think of the CRC as being held in a 32 bit register. The sequence is to xor a byte of data into the right most byte of the CRC register, bits 7 through 0 (which technically correspond to the polynomial coefficients of x^24 to x^31). Then the CRC is "cycled" to the right for 8 bits (via table lookup). Once all data bytes have gone through this cycle, based on the comment from Mark Adler, it's the CRC is appended to data most significant byte first, (CRC>>24)&0xff, (CRC>>16)&0xff, (CRC>>8)&0xff, (CRC)&0xff.

The wiki article may help. For the example in the computation section, the dividend would be an array of data bytes with the bits of each byte reversed, the bits of the 33 bit polynomial would be non-reversed (0x104C11DB7). After doing the computation, the bits of the remainder would be reversed and appended to the data bytes.

https://en.wikipedia.org/wiki/Cyclic_redundancy_check


Mark Adler's answer includes a link to a good tutorial for a CRC. His answer also explains the x's used in a polynomial. It's just like a polynomial in algebra, except the coefficients can only be 0 or 1, and addition (or subtraction) is done using XOR.


what is x

From the wiki example:

data     = 11010011101100 = x^13 + x^12 + x^10 + x^7 + x^6 + x^5 + x^3 + x^2
divisor  =           1011 = x^3 + x + 1

Three 0 bits are appended to the data, effectively multiplying it by x^3:

dividend = 11010011101100000 = x^16 + x^15 + x^13 + x^10 + x^9 + x^8 + x^6 + x^5

Then the crc = dividend % divisor, with coefficients restricted to 0 or 1.

(x^16 + x^15 + x^13 + x^10 + x^9 + x^8 + x^6 + x^5) % (x^3 + x + 1) = x^2
11010011101100000 % 1011 = 100