4
votes

From the sample image below, I have a border in yellow just for display purposes only.

The actual .png file is a simple black/white image, 3 pixels by 3 pixels. I originally thought of trying a 2x2, but that would not help me tell a low/high drawing stream from a high/low one. This way, I would at least have two black, one white from the top, or one white, two black from the bottom.

So I read the chunks of data, get to the IDAT chunk, decode that (zlib) and come up with 12 bytes as follows

00 20 00 40 00 80

So, my question: how does the above get broken down into the 3x3 black-and-white sample? Also, it is saved in palette format, and I properly recognize the bit depth of 1 and a color palette of 2 entries. palette[0] is RGBA all zeros; palette[1] has RGBA of 255, 255, 255, 0.

I'll eventually get into the other bit depth formats later; I just wanted to start with what I would expect to be the easiest.

Part II. Any guidance on handling the other depth formats would help, especially anything special regarding the alpha channel (which I am already looking for in the palette) that might trip me up.

enter image description here

1
If you want to fully understand the IDAT format, just read the standard, it's quite simple: libpng.org/pub/png/spec/iso/index-object.html#11IDAT - leonbloy
Thanks for that additional document... I'll look into that too, but the one below actually cleared up a BUNCH for me that I did not actually follow from any spec... Graphics is not my primary, so I'm trying to learn vs just "here... use this." - DRapp

1 Answer

5
votes

It would be easier if you used libpng, so I guess this is for learning purposes.

The thing is, if you decompress the IDAT chunk directly, you get data that is not all meant to be displayed and that may need to be transformed (because a filter was applied) to get the actual bytes. In the PNG format each line starts with an extra byte that tells you which filter was applied to that line; the remaining bytes contain the line's pixels.
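As a rough sketch of that layout, the decompressed stream can be cut into (filter byte, pixel bytes) pairs. This assumes a non-interlaced image with one sample per pixel (palette or grayscale); `split_scanlines` is a hypothetical helper name, and the IDAT payload is simulated here with `zlib.compress` rather than read from a real file.

```python
import zlib

def split_scanlines(raw: bytes, width: int, bit_depth: int):
    """Split decompressed IDAT data into (filter_type, row_bytes) pairs.

    Assumes non-interlaced data with one sample per pixel.
    """
    stride = (width * bit_depth + 7) // 8   # pixel bytes per row, rounded up
    row_len = 1 + stride                    # plus the leading filter byte
    rows = []
    for off in range(0, len(raw), row_len):
        filter_type = raw[off]
        row_bytes = raw[off + 1 : off + row_len]
        rows.append((filter_type, row_bytes))
    return rows

# Stand-in for the concatenated IDAT payload of the 3x3, 1-bit image.
idat_payload = zlib.compress(bytes.fromhex("002000400080"))
raw = zlib.decompress(idat_payload)
rows = split_scanlines(raw, width=3, bit_depth=1)
for filter_type, row_bytes in rows:
    print(filter_type, row_bytes.hex())
```

For the 3x3 sample, each row is 2 bytes: one filter byte and one byte holding the three pixels.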

BTW, 00 20 00 40 00 80 are 6 bytes only (not 12, as you think). Now if you see this data as binary, your 3 lines would look like this:

00000000 00100000
00000000 01000000
00000000 10000000

Now, your image is 1 bit per pixel, so 1 byte is enough to store a line of 3 pixels. Only the 3 highest bits are actually used (the 5 lower bits are ignored). I replaced the ignored bits with an x, so I think it is easier to see the actual pixels (0 is black, 1 is white):

00000000 001xxxxx
00000000 010xxxxx
00000000 100xxxxx
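The bit layout above can be sketched in code: pixels are packed starting from the most significant bit of each byte, and any leftover low bits are padding. `unpack_row` is a hypothetical helper that assumes the filter byte has already been removed and that the bit depth is 1, 2, 4, or 8.

```python
def unpack_row(row_bytes: bytes, width: int, bit_depth: int = 1):
    """Return the palette indices (or gray samples) stored in one row."""
    mask = (1 << bit_depth) - 1
    pixels = []
    for x in range(width):
        bit_pos = x * bit_depth
        byte = row_bytes[bit_pos // 8]
        # Leftmost pixel lives in the highest bits of the byte.
        shift = 8 - bit_depth - (bit_pos % 8)
        pixels.append((byte >> shift) & mask)
    return pixels

# Pixel bytes of the three rows, without their leading filter bytes.
image = [unpack_row(bytes([b]), width=3) for b in (0x20, 0x40, 0x80)]
for row in image:
    print(row)
```

Each value is an index into the palette, so here 0 selects palette[0] and 1 selects palette[1].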

In this case, no filter was applied to any line, because the first byte of each line is zero (0 means no filter was applied; values from 1 to 4 mean a filter was applied).
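For rows where the filter byte is not zero, the PNG specification defines filter types 1 to 4 as Sub, Up, Average, and Paeth, and each byte has to be reconstructed from its neighbours. The following is a minimal sketch of that reversal, not libpng's implementation; `unfilter` and `paeth` are hypothetical names, and `bpp` is the number of bytes per complete pixel (rounded up to 1 for bit depths below 8).

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: pick whichever of left, up, upper-left is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def unfilter(rows, bpp: int = 1):
    """rows is a list of (filter_type, row_bytes); returns reconstructed rows."""
    out = []
    prev = None
    for filter_type, data in rows:
        if prev is None:
            prev = bytes(len(data))          # row above the first row is all zeros
        cur = bytearray(len(data))
        for i, x in enumerate(data):
            a = cur[i - bpp] if i >= bpp else 0    # byte to the left
            b = prev[i]                            # byte above
            c = prev[i - bpp] if i >= bpp else 0   # byte above-left
            if filter_type == 0:
                pred = 0
            elif filter_type == 1:
                pred = a
            elif filter_type == 2:
                pred = b
            elif filter_type == 3:
                pred = (a + b) // 2
            elif filter_type == 4:
                pred = paeth(a, b, c)
            else:
                raise ValueError(f"unknown filter type {filter_type}")
            cur[i] = (x + pred) & 0xFF
        out.append(bytes(cur))
        prev = cur
    return out

# The sample image uses filter 0 on every row, so the bytes come back unchanged.
print(unfilter([(0, b"\x20"), (0, b"\x40"), (0, b"\x80")]))
```

With filter 0 the predictor is always zero, which is why the sample's bytes could be read directly.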