11 votes

This algorithm has been in my mind for a long time, but I cannot find it described anywhere. It's so simple though that I can't be the only one who has thought of it. Here's how it works:

You start with an image. Say, 7x7px:

[Image: Algorithm 1 - the 7x7 source pixel grid]

You need to resample it say, to 5x5px:

[Image: Algorithm 2 - the 5x5 target grid overlaid on the source]

So all you do is take the average color of each new square:

[Image: Algorithm 3 - each target pixel takes the average color of the source area it covers]
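For concreteness, here's a rough Python sketch of what I mean (numpy assumed; naive loops, completely unoptimized, and `area_average_resize` is just my name for it):

```python
import numpy as np

def area_average_resize(src, new_h, new_w):
    """Each destination pixel is the average of the source area it covers,
    weighting border pixels by their fractional overlap."""
    h, w = src.shape[:2]
    out = np.zeros((new_h, new_w) + src.shape[2:])
    for i in range(new_h):
        y0, y1 = i * h / new_h, (i + 1) * h / new_h  # cell extent in source rows
        for j in range(new_w):
            x0, x1 = j * w / new_w, (j + 1) * w / new_w
            acc, area = 0.0, 0.0
            for y in range(int(y0), int(np.ceil(y1))):
                wy = min(y + 1, y1) - max(y, y0)      # fractional row overlap
                for x in range(int(x0), int(np.ceil(x1))):
                    wx = min(x + 1, x1) - max(x, x0)  # fractional column overlap
                    acc = acc + src[y, x] * (wy * wx)
                    area += wy * wx
            out[i, j] = acc / area
    return out
```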

This isn't nearest-neighbor, because nearest-neighbor takes the color of just one pixel, not of the fractional pixels that happen to overlap the new square. It's also not bilinear, bicubic, Lanczos or anything else interpolating.

So - what is it? It intuitively seems to me that this should be the "mathematically perfect" resampling algorithm, although since I don't have a definition of what "mathematically perfect" is, I cannot prove or disprove that.

Last but not least, "mathematically perfect" isn't always "best looking", so I wonder how it compares to other mainstream image resampling algorithms (bicubic, Lanczos) in terms of "quality". This is a subjective term, of course, so I'm really interested in whether there are significant differences between this algorithm and others that most people would agree upon.

P.S. A few things I can already tell about it: it won't be "best looking" for pixel art, as demonstrated here; there are special algorithms for that (2xSaI etc.). It also won't be best for enlarging pictures; interpolation would win out there. But for shrinking pictures...?

Update 1: Hmm, just found out about supersampling. This seems like a variant of it, with a grid-type arrangement of samples, where the number of samples is optimized for the resolution of the source & target images.

Paul R: Averaging is a very poor low-pass filter (i.e. anti-aliasing filter), so I would expect results to be inferior to most other methods, where better filters are used.
Vilx-: @PaulR - even when downsampling?
Paul R: Yes, that's the whole point - ideally you need perfect filtering before you decimate, otherwise you get aliasing and/or loss of information. Averaging is effectively a very bad LPF which is not very flat in the passband and has large lobes in the stopband.
Vilx-: @PaulR - Wow... ok... I guess I'll need to go and read those books about signal & image processing after all. :P
Mark Ransom: You're right, you're not the first - I was using this technique 25 years ago, and I'm sure I wasn't the first either. I didn't have a name for it. It's easy to implement and optimizes very well. To see an example of how it fails vs. a better filter, resize a large image to one pixel smaller - the edges will be nice and crisp but the center will be blurred.

5 Answers

3 votes

The algorithm you have described is called an area-averaging algorithm; it is seldom applied for shrinking images.

A simpler variant of it is used as an anti-aliasing technique for smoothing rendered images in computer games.

The algorithm behind this technique is called supersampling.


Thanks to @Guffa for pointing this out: supersampling is a simplification of the above algorithm. Because it takes discrete sample points, it can miss certain colors entirely, or count one color more often than another even though it is not the most dominant. The algorithm above is equivalent to supersampling with infinitely many sample points.
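A rough sketch of the grid variant (my own illustration, not taken from any library): each destination pixel averages an n x n grid of point samples, which converges to the exact area average as n grows.

```python
import numpy as np

def supersample_resize(src, new_h, new_w, n=4):
    """Approximate area averaging with an n x n grid of point samples
    per destination pixel."""
    h, w = src.shape[:2]
    out = np.zeros((new_h, new_w) + src.shape[2:])
    offsets = (np.arange(n) + 0.5) / n  # sample positions inside a destination cell
    for i in range(new_h):
        for j in range(new_w):
            acc = 0.0
            for dy in offsets:
                for dx in offsets:
                    y = min(int((i + dy) * h / new_h), h - 1)
                    x = min(int((j + dx) * w / new_w), w - 1)
                    acc = acc + src[y, x]
            out[i, j] = acc / (n * n)
    return out
```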

Update: Just noticed that even Java appreciates your algorithm :)
AreaAveragingScaleFilter

3 votes

I will start out by saying I don’t know an official name for your algorithm. I know that Paint Shop Pro called it “Bilinear” early on, but was forced to rename it to “Weighted Average” in version 8 when it was pointed out that the algorithm didn’t match the classic definition of Bilinear.

Most resizing algorithms can be applied in two independent passes, one in the X direction and one in the Y. This is not only more efficient, but it makes it a lot easier to describe and reason about the different algorithms. From this point forward I’m going to work in one dimension and assume you can extrapolate to 2D.

Your input consists of 7 pixels which we will give coordinates of 0, 1, 2, 3, 4, 5, 6. It’s useful to realize that a pixel is not a little square in this context, but is just a single point. To create the output you will want the interpolated values from the points 0.2, 1.6, 3.0, 4.4, 5.8. Why not 0.0, 1.5, 3.0, 4.5, 6.0? Suppose you doubled the size of the input and output to 14x14 and 10x10: the coordinates would now be 0.0, 1.44, 2.89, 4.33, 5.78, 7.22, 8.67, 10.11, 11.56, 13.0. Starting with the second pixel the results would be different, and that’s unacceptable. All the points should be 7/5 apart, giving the coordinates 0.2, 1.6, 3.0, 4.4, 5.8, 7.2, 8.6, 10.0, 11.4, 12.8.
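As a quick sanity check of those numbers (a throwaway snippet; `sample_coords` is just my name for the pixel-center mapping):

```python
# Pixel-center convention: src_x = (dst_x + 0.5) * src_size / dst_size - 0.5
def sample_coords(src_size, dst_size):
    return [(i + 0.5) * src_size / dst_size - 0.5 for i in range(dst_size)]

print(sample_coords(7, 5))    # [0.2, 1.6, 3.0, 4.4, 5.8]
print(sample_coords(14, 10))  # [0.2, 1.6, 3.0, ..., 12.8], all 7/5 apart
```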

Let’s compare the common resizing algorithms when expressed as a filter, and see how they compare to yours.

Nearest Neighbor filter

This first example in the generic form is called a Box or Averaging filter. But a magical thing happens when the width of the box filter is exactly 1.0: one pixel from the input is going to fall within the box and be given a weight of 1.0, and all the other pixels in the input will be given the weight 0.0. This makes it the equivalent of the Nearest Neighbor algorithm.
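In code, the kernel might look like this (a sketch in my own notation; `x` is the distance from the sample point):

```python
def box(x, width=1.0):
    # Box / averaging kernel. With width exactly 1.0, exactly one source
    # pixel falls inside the box and gets weight 1: nearest neighbor.
    return 1.0 if abs(x) <= width / 2 else 0.0
```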

Bilinear filter

Our second example is generically called the Tent filter. Again it becomes something special when the width is exactly 2.0: it becomes linear interpolation; applied in 2D it’s called Bilinear.
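A sketch of that kernel (same conventions as above):

```python
def tent(x):
    # Tent kernel of width 2.0; weights fall off linearly with distance,
    # which is exactly linear interpolation (bilinear in 2D).
    return max(0.0, 1.0 - abs(x))
```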

Bicubic filter

The third example is a Cubic filter, which when applied in 2D is called Bicubic. There are different variations of this formula; this example uses the one suggested by Mitchell and Netravali.
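A sketch of the Mitchell–Netravali form of the cubic kernel, with their suggested B = C = 1/3:

```python
def mitchell(x, b=1/3, c=1/3):
    # Mitchell-Netravali cubic kernel; support is |x| < 2.
    x = abs(x)
    if x < 1:
        return ((12 - 9*b - 6*c)*x**3 + (-18 + 12*b + 6*c)*x**2 + (6 - 2*b)) / 6
    if x < 2:
        return ((-b - 6*c)*x**3 + (6*b + 30*c)*x**2
                + (-12*b - 48*c)*x + (8*b + 24*c)) / 6
    return 0.0
```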

Gaussian filter

While the Gaussian filter isn’t often used in resizing applications, I added it here for comparison.
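For completeness, a sketch of that kernel too (the sigma value is my own arbitrary choice for illustration):

```python
import math

def gaussian(x, sigma=0.5):
    # Gaussian kernel; wider sigma means more blur.
    return math.exp(-x * x / (2.0 * sigma * sigma))
```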

Weighted Average filter

Finally we reach your algorithm. It’s a combination of averaging and bilinear, a tent with a flat top.
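In code, that flat-topped tent (a trapezoid) might look like this (my own sketch; `scale` is the source/destination size ratio, assumed to be at least 1, so 7/5 for the question's example):

```python
def flat_top_tent(x, scale=7/5):
    # A box of width `scale` convolved with a unit box: flat top of
    # width scale - 1, linear slopes down to zero at (scale + 1) / 2.
    ax = abs(x)
    if ax <= (scale - 1) / 2:
        return 1.0
    if ax <= (scale + 1) / 2:
        return (scale + 1) / 2 - ax
    return 0.0
```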

1 vote

Contrary to what I read in other answers, this algorithm is actually quite popular for supersampling, at least in the image processing community.

It is implemented in Intel's Performance Primitives library under the name Super Sampling; the (rather uninformative) name is a way of stating that the library has no alternative algorithm for supersampling. In OpenCV, it goes under the name INTER_AREA; it is listed amongst other interpolation types, which could suggest they are interchangeable, but the documentation mentions that "it may be a preferred method for image decimation", a rather conservative statement to my taste.

When you supersample an image by an integer factor, say by a factor of two, taking the mean of the underlying pixels for the resulting image (as done for example by scikit-image's downscale_local_mean) is indeed optimal in a specific sense.
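For example (a quick sketch; the function and flag names are real, the random image is just a stand-in):

```python
import numpy as np
import cv2
from skimage.transform import downscale_local_mean

img = np.random.rand(8, 8).astype(np.float32)  # stand-in for a real image

# Integer factor: each output pixel is the mean of a 2x2 block.
half = downscale_local_mean(img, (2, 2))  # 8x8 -> 4x4

# Non-integer factor: OpenCV's area averaging generalizes the same idea.
small = cv2.resize(img, (5, 5), interpolation=cv2.INTER_AREA)  # 8x8 -> 5x5
```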

Suppose your image is obtained by some quantitative measurement of a signal by a grid of receptors; for example, a photograph or an X-ray image, where each receptor counts incoming photons. Pixel values are then proportional to the amount of signal received by a given receptor.

If you assume that your camera is perfect — no spread of the signal, 100% coverage of the receiving area — then mean supersampling is optimal because it gives the exact image that would be received by a perfect camera with half the resolution.

Area averaging is a straightforward generalization of this optimal mean supersampling to non-integer ratios, which explains its popularity, although it cannot boast the same property for any supersampling ratio other than integers.

0 votes

Your description doesn't remind me of an algorithm so much as of a data structure and data type. It reminds me of a kd-tree or a quadtree; either can help you with nearest-neighbor queries. However, if you want a mathematical function for grids, you can also take a look at space-filling curves, especially Z-order (Morton) curves. These work great for powers of 2 and reduce the two-dimensional problem to one dimension. The mapping is separable: f(x, y) is built by combining f(x) and f(y), interleaving their bits.
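For what it's worth, a minimal sketch of that bit interleaving (my own illustration):

```python
def morton_index(x, y, bits=16):
    """Z-order (Morton) index: interleave the bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return z

print(morton_index(3, 5))  # 39: x bits 011, y bits 101 interleave to 100111
```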

-2 votes

"It's also not bilinear"

Yes, actually it is. Bilinear resampling is just linear interpolation in two dimensions.

You are describing it as getting the average color of the surface of the pixel, but interpolation from the surrounding pixels is just a much simpler way to calculate the same value.

The interpolation is done one dimension at a time, so you just calculate overlapping sides instead of overlapping areas, which is a lot simpler.
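Here's a minimal sketch of that one-dimension-at-a-time idea, using plain linear interpolation (my own illustration; grayscale images only):

```python
import numpy as np

def resize_1d(line, new_len):
    # Linear interpolation of a 1-D signal, pixel-center convention.
    old_len = len(line)
    coords = (np.arange(new_len) + 0.5) * old_len / new_len - 0.5
    coords = np.clip(coords, 0, old_len - 1)
    lo = np.floor(coords).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    frac = coords - lo
    return line[lo] * (1 - frac) + line[hi] * frac

def resize_separable(img, new_h, new_w):
    # One dimension at a time: first every row, then every column.
    rows = np.array([resize_1d(r, new_w) for r in img])
    return np.array([resize_1d(c, new_h) for c in rows.T]).T
```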