3 votes

I know the CV_8UC1 and CV_8UC3 formats for images: they store 1-channel and 3-channel (e.g. RGB) images respectively, with unsigned char values in the range 0-255.

But what about floating-point formats like CV_32F? I have two questions about them.

1- Why are they 32-bit? I mean, why would we need to store the intensity of a pixel in a range like 0-4294967295? (And then I think we must convert it back to 8-bit numbers in the 0-255 range!)
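(I guess the way back is something like this? A sketch of what I mean, using convertTo/normalize; the img32f values here are made up:)

    #include <opencv2/opencv.hpp>

    int main()
    {
        // A 32-bit float image, e.g. the result of some filtering step.
        cv::Mat img32f(480, 640, CV_32FC1);
        cv::randu(img32f, cv::Scalar(-1.0f), cv::Scalar(1.0f)); // arbitrary float values

        // Map the float range onto 0..255, then convert to 8-bit for display.
        cv::Mat img8u;
        cv::normalize(img32f, img32f, 0.0, 255.0, cv::NORM_MINMAX);
        img32f.convertTo(img8u, CV_8UC1);

        cv::imshow("as 8-bit", img8u);
        cv::waitKey(0);
        return 0;
    }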

2- As far as I know, we use a two-dimensional matrix to store an image, indexed by integer coordinates (x, y). But I have seen code that uses floating-point numbers for a pixel's location/coordinates, like this passage from O'Reilly's book:

Subpixel corners: If you are processing images for the purpose of extracting geometric measurements, as opposed to extracting features for recognition, then you will normally need more resolution than the simple pixel values supplied by cv::goodFeaturesToTrack(). Another way of saying this is that such pixels come with integer coordinates whereas we sometimes require real-valued coordinates, for example a pixel location of (8.25, 117.16).

What is the meaning of this?
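For context, the function the book goes on to describe is cv::cornerSubPix(); as far as I can tell the call looks roughly like this (untested, "board.png" is just a placeholder):

    #include <opencv2/opencv.hpp>
    #include <vector>

    int main()
    {
        cv::Mat gray = cv::imread("board.png", cv::IMREAD_GRAYSCALE); // placeholder file

        // Integer-accurate corners first...
        std::vector<cv::Point2f> corners;
        cv::goodFeaturesToTrack(gray, corners, 100, 0.01, 10);

        // ...then refined to real-valued positions such as (8.25, 117.16).
        cv::cornerSubPix(gray, corners, cv::Size(5, 5), cv::Size(-1, -1),
                         cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
        return 0;
    }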

You're thinking about very simple images and very simple maths. I don't have time to write an answer, but check out here - GPPK
One reason is that you apply math formulas to the image (e.g. a gradient), which generates float images whose values mean nothing color-wise but make sense math-wise. For pixel locations it is much the same: real-valued coordinates give more accurate results. Adding 8.25 + 8.25 and rounding the result (17) is not the same as adding 8 + 8, which gives 16. - api55
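(To illustrate api55's point, a tiny untested sketch: a gradient of an 8-bit image is typically requested at CV_32F depth precisely because the result can be negative, which an unsigned 8-bit image cannot hold:)

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        cv::Mat gray = cv::imread("input.png", cv::IMREAD_GRAYSCALE); // placeholder file

        // Horizontal gradient; CV_32F keeps the negative values instead of clipping them.
        cv::Mat gradX;
        cv::Sobel(gray, gradX, CV_32F, 1, 0);

        double minV, maxV;
        cv::minMaxLoc(gradX, &minV, &maxV);
        std::cout << "gradient range: " << minV << " .. " << maxV << "\n";
        return 0;
    }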

1 Answer

3 votes

I'll try to give a short explanation, though I'm sure you'll come to understand it better with time anyhow. When we capture an image digitally, we use a grid-like sensor and measure the amount of light that hits each cell. Light is a physical quantity: the fact that you limit the measured values to 0..255 does not mean our eyes cannot distinguish finer gradations. That is why many applications store values in the range 0..2^16-1 instead of 0..2^8-1 (which is 255).
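For example (an untested sketch; "scan16.tif" stands in for any 16-bit source):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        // IMREAD_UNCHANGED keeps the file's native depth instead of forcing 8-bit.
        cv::Mat img = cv::imread("scan16.tif", cv::IMREAD_UNCHANGED); // placeholder file
        std::cout << "stored as CV_16U: " << (img.depth() == CV_16U) << "\n";

        // Scale 0..65535 down to 0..255 (65535 / 255 = 257) for an 8-bit view.
        cv::Mat img8u;
        img.convertTo(img8u, CV_8U, 1.0 / 257.0);
        return 0;
    }

Now, I think you know enough to answer your two questions: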

  1. Imagine I have an image of a farm taken from an airplane, and I know there is a farmer somewhere in this picture. I want to store in each pixel the probability that this pixel is the farmer (people are very small when shot from an airplane). A probability ranges from 0 to 1, so I need floating-point numbers. And if I do a high-precision calculation, I might need not just a 32-bit float but a 64-bit one, to gain more precision (see the first sketch after this list).

  2. As mentioned above, we capture only a "table of pixels", but we assume that reality is more complex, as indeed it is. So the farmer might "fall in" between two pixels. We can then take a different approach: find the tractor the farmer is riding and take its middle, which may well fall between two pixels (see the second sketch below).
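A sketch of point 1 (my toy example, the numbers are made up): a per-pixel probability map has no sensible home in 8-bit integers:

    #include <opencv2/opencv.hpp>

    int main()
    {
        // One probability per pixel, in [0, 1]; integers would throw the precision away.
        cv::Mat prob(480, 640, CV_32FC1, cv::Scalar(0));
        prob.at<float>(117, 8) = 0.73f; // row 117, col 8: "73% chance this is the farmer"

        // If a computation accumulates many tiny terms, CV_64F buys extra headroom.
        cv::Mat prob64;
        prob.convertTo(prob64, CV_64F);
        return 0;
    }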
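And for point 2, OpenCV can actually read the image "between" pixels by bilinear interpolation; a minimal sketch with cv::getRectSubPix ("farm.png" is a placeholder), sampling a patch centered at a real-valued location:

    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat gray = cv::imread("farm.png", cv::IMREAD_GRAYSCALE); // placeholder file

        // Extract a 21x21 patch whose center sits between pixels, at (8.25, 117.16).
        // Values are interpolated from the surrounding pixels.
        cv::Mat patch;
        cv::getRectSubPix(gray, cv::Size(21, 21), cv::Point2f(8.25f, 117.16f), patch);
        return 0;
    }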

Read on, with practice you'll get the hang of it.