Assessing the quality of an image with respect to compression?

Question

I have images that I am using for a computer vision task. The task is sensitive to image quality. I'd like to remove all images that are below a certain threshold, but I am unsure if there is any method/heuristic to automatically detect images that are heavily compressed via JPEG. Anyone have an idea?

Did you end up solving your problem? Have a look at my updated answer. I ended up implementing it over the weekend (combining anisotropic diffusion and SSIM approach) and it seems to give promising results. I used anisotropic diffusion instead of Gaussian blurring. — mpenkov

mpenkov mpenkov · Accepted Answer · 2010-11-16T12:29:59

Image Quality Assessment is a rapidly developing research field. As you don't mention being able to access the original (uncompressed) images, you are interested in no reference image quality assessment. This is actually a pretty hard problem, but here are some points to get you started:

Since you mention JPEG, there are two major degradation features that manifest themselves in JPEG-compressed images: blocking and blurring
No-reference image quality assessment metrics typically look for those two features
Blocking is fairly easy to pick up, as it appears only on macroblock boundaries. Macroblocks are a fixed size -- 8x8 or 16x16 depending on what the image was encoded with
Blurring is a bit more difficult. It occurs because higher frequencies in the image have been attenuated (removed). You can break up the image into blocks, DCT (Discrete Cosine Transform) each block and look at the high-frequency components of the DCT result. If the high-frequency components are lacking for a majority of blocks, then you are probably looking at a blurry image
Another approach to blur detection is to measure the average width of edges of the image. Perform Sobel edge detection on the image and then measure the distance between local minima/maxima on each side of the edge. Google for "A no-reference perceptual blur metric" by Marziliano -- it's a famous approach. "No Reference Block Based Blur Detection" by Debing is a more recent paper

Regardless of what metric you use, think about how you will deal with false positives/negatives. As opposed to simple thresholding, I'd use the metric result to sort the images and then snip the end of the list that looks like it contains only blurry images.

Your task will be a lot simpler if your image set contains fairly similar content (e.g. faces only). This is because the image quality assessment metrics can often be influenced by image content, unfortunately.

Google Scholar is truly your friend here. I wish I could give you a concrete solution, but I don't have one yet -- if I did, I'd be a very successful Masters student.

UPDATE:

Just thought of another idea: for each image, re-compress the image with JPEG and examine the change in file size before and after re-compression. If the file size after re-compression is significantly smaller than before, then it's likely the image is not heavily compressed, because it had some significant detail that was removed by re-compression. Otherwise (very little difference or file size after re-compression is greater) it is likely that the image was heavily compressed.

The use of the quality setting during re-compression will allow you to determine what exactly heavily compressed means.

If you're on Linux, this shouldn't be too hard to implement using bash and imageMagick's convert utility.

You can try other variations of this approach:

Instead of JPEG compression, try another form of degradation, such as Gaussian blurring
Instead of merely comparing file-sizes, try a full reference metric such as SSIM -- there's an OpenCV implementation freely available. Other implementations (e.g. Matlab, C#) also exist, so look around.

Let me know how you go.

Assessing the quality of an image with respect to compression?

3 Answers