
I'm working on an OpenCV program to find the distance from the camera to a rectangle with a known aspect ratio. Finding the distance to a rectangle as seen from a forward-facing view works just fine:

[Image: forward-facing]

The actual distance is very close to the distance calculated by this:

$$ d = c \cdot \frac{w_{target} \cdot p_{image}}{2 \cdot p_{target} \cdot \tan(\theta_{fov} / 2)} $$

Here w_target is the actual width (in inches) of the target, p_image is the pixel width of the overall image, p_target is the largest width (in pixels) of the detected quadrilateral, and θ_fov is the horizontal field of view (FOV) of our webcam. The result is then multiplied by some constant c.
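For reference, the formula above can be written as a small function. This is only a sketch of the head-on calculation as described; the function name `estimate_distance` and its parameter names are made up for illustration, and `c` defaults to 1.0 because the question does not give its value.

```python
import math

def estimate_distance(w_target, p_image, p_target, fov_deg, c=1.0):
    """Head-on distance estimate from the question's formula.

    w_target: real width of the target (inches)
    p_image:  width of the whole image (pixels)
    p_target: widest extent of the detected quadrilateral (pixels)
    fov_deg:  horizontal field of view of the camera (degrees)
    c:        empirical correction constant (assumed 1.0 here)
    """
    half_fov = math.radians(fov_deg) / 2.0
    return c * (w_target * p_image) / (2.0 * p_target * math.tan(half_fov))
```

With a 90-degree FOV, tan(θ_fov / 2) is 1, so a 10-inch target that spans 320 of 640 pixels comes out at about 10 inches (times c).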

The issue occurs when the target rectangle is viewed from a perspective that isn't forward-facing:

[Image: not forward-facing]

The difference in actual distance between these two orientations is small, but the detected distance differs by almost 2 feet.

What I'd like to know is how to calculate the distance consistently, accounting for different perspective angles. I've experimented with getPerspectiveTransform, but that requires me to know the resulting scale of the target - I only know the aspect ratio.

Possible to get you to Spell Out Abbreviations When First Used (SOAWFU)? What's a FOV? Is that the angle of the Field of View of the camera lens? Is it possible to normalize the skewed box to the camera lens, and calculate distance that way? (Convert the perspective box to a pure rectangle using midpoints, then determine distance using a normal view calculation?) - zipzit
@zipzit Yes, it's the horizontal field of view of the camera. It seems that using the midpoints creates a parallelogram. Not sure what I'd do with this parallelogram though. - August
Here's an awesome answer with a demo related to performing the calculations for a CSS implementation: "How to match 3D perspective of real photo and object in CSS3 3D transforms" - ashleedawg

1 Answer


Here's what you know:

  1. The distance between the top left and top right corners in inches (w_target)
  2. The distance between those corners in pixels on a 2D plane (p_target)

The trouble is that you're not accounting for p_target shrinking when the rectangle is at an angle. For example, when the rectangle is turned 45 degrees, its projected width drops to roughly cos(45°) ≈ 71% of the head-on value, so you lose about 30% of your pixels in p_target. Your formula still assumes the full w_target spans those pixels, so you overestimate the distance.
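The size of this effect can be sketched with a simple cosine model. This assumes the target is far enough away that perspective within the target is negligible (a weak-perspective approximation), and that it is turned only about its vertical axis; the function names are hypothetical.

```python
import math

def apparent_width_fraction(yaw_deg):
    """Fraction of the head-on pixel width that remains when the
    target is turned by yaw_deg (weak-perspective approximation)."""
    return math.cos(math.radians(yaw_deg))

def overestimate_factor(yaw_deg):
    """Factor by which a head-on formula overestimates distance,
    since distance is inversely proportional to pixel width."""
    return 1.0 / apparent_width_fraction(yaw_deg)
```

Under this model a target turned 60 degrees shows about half its head-on pixel width, so the head-on formula reports roughly twice the true distance.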

To account for this, you should estimate the angle the box is turned. I'm not sure of an easy way to extract that information out of getPerspectiveTransform, but it may be possible. You could set up a constrained optimization where the decision variables are distance and skew angle, and constrain the metric distance between the left and right corners of the box to equal w_target.
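One way to make this concrete, without a general optimizer, is a closed-form version of the same idea. This is a sketch under strong assumptions, not the approach the answer prescribes: an undistorted pinhole camera, a target turned only about its vertical axis, and pixel measurements taken relative to the image centre. The ratio of the left and right edge heights gives the ratio of their depths, and enforcing the known width w_target between the two edges fixes the scale. All function and parameter names are hypothetical.

```python
import math

def focal_length_px(p_image, fov_deg):
    """Focal length in pixels from image width and horizontal FOV."""
    return p_image / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

def distance_and_yaw(u_left, u_right, h_left, h_right,
                     w_target, p_image, fov_deg):
    """Estimate distance to the target centre and its yaw angle.

    u_left, u_right: pixel x of the left/right vertical edges,
                     measured from the image centre
    h_left, h_right: pixel heights of the left/right vertical edges
    w_target:        real width of the target
    Returns (distance, yaw_deg); distance is in w_target's units.
    """
    f = focal_length_px(p_image, fov_deg)
    # Pixel height is inversely proportional to depth, so this is
    # the depth ratio z_right / z_left.
    k = h_left / h_right
    # Each edge lies on a ray (u / f, 1) * z in the camera x-z plane.
    dx = (k * u_right - u_left) / f
    dz = k - 1.0
    # Enforce the metric width: |P_right - P_left| == w_target.
    z_left = w_target / math.hypot(dx, dz)
    z_right = k * z_left
    x_left = u_left / f * z_left
    x_right = u_right / f * z_right
    # Distance to the midpoint between the two edges.
    distance = math.hypot((x_left + x_right) / 2.0,
                          (z_left + z_right) / 2.0)
    yaw_deg = math.degrees(math.atan2(z_right - z_left,
                                      x_right - x_left))
    return distance, yaw_deg
```

Real detections are noisy, so in practice a least-squares fit over all four corners, on a calibrated camera, would likely be more robust than this two-edge calculation.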

Finally, no matter what you are doing, you should make sure your camera is calibrated. Depending on your application, you might be able to use AprilTags to just solve your problem.