0
votes

I have some images using different encoding types (JPEG Lossless and baseline JPEG), and I want to extract this information from my images.

I've tried a bunch of 'exif' parsers/readers, but none of them work on my images because of the special JPEG Lossless encoding (SOF3).

I tried some online tools (get-metadata dot com, etc.) and they work fine. This is what they output:

Encoding Process: Lossless, Huffman coding

Or, for a normal JPEG:

Encoding Process: Baseline DCT, Huffman coding

Do you know an easy way or a library that could do it? I have the image as a buffer in NodeJS.

For information, this is the code I'm using right now, it's working - but I'm not sure it's reliable :

//0xFF 0xD8 means that it is a JPEG image
if (pixelData[0] === 0xFF && pixelData[1] === 0xD8) {

    //this field contains the encoding process
    //see https://www.loc.gov/preservation/digital/formats/fdd/fdd000334.shtml
    //see https://en.wikipedia.org/wiki/JPEG for more info
    // NOT SURE ABOUT THIS ?
    const jpegProcess = pixelData[21];

    //0xC3 0xC7 0xCB 0xCF is for JPEG Lossless compression (SOF3)
    if (jpegProcess === 0xC3 || jpegProcess === 0xC7 || jpegProcess === 0xCB || jpegProcess === 0xCF) {
        // ...decode image
    } //0xC0 0xC2 0xDB are baseline JPEG
    else {
        // ...do other stuff...
    }
}
1
Have you thought of using ImageMagick? - Luis Estevez
It seems like a huge module for just getting the metadata. - HRK44
It's not a node_module. You install the CLI for ImageMagick and have Node.js write commands to the shell, for example: exec('magick identify -verbose image.jpg') - Luis Estevez
It doesn't seem to handle SOF3 images. Also, I'm working with Docker containers and can't install too many programs. - HRK44

1 Answer

0
votes

Your code needs a little more work. You need to skip over the segments that carry length fields, because their payloads can contain the raw value 0xFF. Then you need to find the start-of-frame (SOF) marker and identify its type.
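
As a rough sketch of that approach (not a tested parser): the code below walks the marker segments from the start of the buffer, jumps over each segment using its 2-byte big-endian length, and stops at the first SOF marker. The names findSofMarker, describeSof and isLossless are made up for this example, and it assumes the buffer starts at the SOI marker and that the SOF appears before any scan data.

```javascript
// Hypothetical helpers for illustration; not part of any library.

// Human-readable names for the SOF marker bytes (0xC4, 0xC8, 0xCC are not SOF).
const SOF_NAMES = {
  0xC0: 'Baseline DCT, Huffman coding',
  0xC1: 'Extended sequential DCT, Huffman coding',
  0xC2: 'Progressive DCT, Huffman coding',
  0xC3: 'Lossless, Huffman coding',
  0xC5: 'Differential sequential DCT, Huffman coding',
  0xC6: 'Differential progressive DCT, Huffman coding',
  0xC7: 'Differential lossless, Huffman coding',
  0xC9: 'Extended sequential DCT, arithmetic coding',
  0xCA: 'Progressive DCT, arithmetic coding',
  0xCB: 'Lossless, arithmetic coding',
  0xCD: 'Differential sequential DCT, arithmetic coding',
  0xCE: 'Differential progressive DCT, arithmetic coding',
  0xCF: 'Differential lossless, arithmetic coding',
};

// Returns the SOF marker byte (e.g. 0xC3), or null if none is found.
function findSofMarker(buf) {
  // Every JPEG stream starts with SOI (0xFF 0xD8).
  if (buf.length < 4 || buf[0] !== 0xFF || buf[1] !== 0xD8) return null;

  let pos = 2;
  while (pos + 3 < buf.length) {
    if (buf[pos] !== 0xFF) return null; // not at a marker: give up
    const marker = buf[pos + 1];

    // 0xFF may be repeated as a fill byte before the marker code.
    if (marker === 0xFF) { pos += 1; continue; }

    // Found a start-of-frame marker.
    if (SOF_NAMES[marker] !== undefined) return marker;

    // Standalone markers carry no length field: TEM, RST0-RST7, SOI.
    if (marker === 0x01 || (marker >= 0xD0 && marker <= 0xD8)) {
      pos += 2;
      continue;
    }

    // EOI or SOS before any SOF: nothing more to find in the headers.
    if (marker === 0xD9 || marker === 0xDA) return null;

    // All other segments: 2-byte big-endian length that includes itself.
    const length = (buf[pos + 2] << 8) | buf[pos + 3];
    if (length < 2) return null; // corrupt length
    pos += 2 + length;           // skip the marker and the whole segment
  }
  return null;
}

function describeSof(marker) {
  return marker === null ? null : SOF_NAMES[marker];
}

// Same four values the question checks for.
function isLossless(marker) {
  return marker === 0xC3 || marker === 0xC7 || marker === 0xCB || marker === 0xCF;
}
```

With the buffer from the question, this would be used as `const sof = findSofMarker(pixelData); if (isLossless(sof)) { ... }`. Skipping by length matters because an APPn or EXIF payload can contain bytes like 0xFF 0xC0 that are not markers, and because the SOF does not sit at a fixed offset such as 21. Note also that 0xC2 is progressive DCT rather than baseline, and 0xDB is the DQT (quantization table) segment, not a frame type.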