4
votes

My problem is quite simple yet I struggle to solve it correctly.

I have a camera looking towards the ground and I know all the parameters of the shot. So, using some maths I was able to compute the 4 points defining the field of view of the camera (the coordinates on the ground of each image's corners).

Now, from the coordinates (x, y) of a pixel of the image, I would like to know its real coordinates projected on the ground.

I thought that homography was the way to go, but I read here and there that "homography maps a plane seen from a camera to the same plane seen from another" which is a slightly different problem.

What should I use, please?


Edit: Here is an example.

Given this image: [image: Ground]

I know everything about the camera that took the picture (height, angles of view, orientation), so I could calculate the coordinates of the four corners forming its field of view on the ground, for example (in centimeters, relative to the camera position, clockwise from top-left): (-300, 500), (300, 500), (100, 50), (-100, 50).

Knowing that the coordinates on the image of the blade of grass are (1750, 480), how can I know its actual coordinates on the ground?

1
Is it possible to see a sample image? It seems that homography can still be useful because the plane is always the same, the ground. - UJIN
@UJIN I updated my question with an example. - Delgan
It should be possible to find the coordinates of the grass in the "projected" plane, but I can't wrap my head around it now. I will try as soon as I have some spare time. I mean, you have 4 points in one coordinate system, and 4 points in another. It should be possible to estimate a homography from those four point correspondences, and then project points from the photo to the actual ground. But at this point I may be all wrong :/ - UJIN
@UJIN Thank you for your time! I think I will go with a simple homography then. - Delgan
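
The homography idea from the comments can be sketched with plain NumPy as below. This is only a minimal sketch, not the method either commenter actually used. The function names `fit_homography` and `apply_homography` are made up for illustration, and the image size of 1920x1080 pixels is an assumption, since the question does not state it. The ground corners are the ones given in the question, matched to the image corners clockwise from top-left.

```python
import numpy as np

def _unit_scale(points):
    """Scaling matrix that brings a point set to roughly unit size (for conditioning)."""
    s = 1.0 / np.abs(points).max()
    return np.diag([s, s, 1.0])

def fit_homography(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from four or more point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    T_src, T_dst = _unit_scale(src), _unit_scale(dst)
    src_n = src * T_src[0, 0]
    dst_n = dst * T_dst[0, 0]
    rows = []
    for (x, y), (X, Y) in zip(src_n, dst_n):
        # Two linear constraints on the 9 entries of H per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, X * x, X * y, X])
        rows.append([0, 0, 0, -x, -y, -1, Y * x, Y * y, Y])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H_n = vt[-1].reshape(3, 3)
    # Undo the scaling: dst ~ inv(T_dst) @ H_n @ T_src @ src
    H = np.linalg.inv(T_dst) @ H_n @ T_src
    return H / H[2, 2]

def apply_homography(H, point):
    """Map a 2D point through H and divide by the homogeneous coordinate."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])

# ASSUMPTION: image size in pixels (not given in the question).
W, H_IMG = 1920, 1080
image_corners = [(0, 0), (W, 0), (W, H_IMG), (0, H_IMG)]
# Ground corners from the question, in cm, clockwise from top-left.
ground_corners = [(-300, 500), (300, 500), (100, 50), (-100, 50)]

H = fit_homography(image_corners, ground_corners)
ground = apply_homography(H, (1750, 480))  # the blade of grass
print(ground)
```

Under the assumed image size, the blade of grass at pixel (1750, 480) lands at about (130.7, 182.4) cm. A different image size would give a different result, so the real width and height should be substituted.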

1 Answer

2
votes

By "knowing everything" about the camera, do you mean you have the camera FOV, rotation and translation with respect to the ground plane? Then it's trivial, right?

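Write the camera matrix K = [[f, 0, w/2],[0, f, h/2],[0, 0, 1]], where w and h are the image width and height in pixels and f is the focal length in pixels (for a horizontal field of view FOV_x, f = (w/2) / tan(FOV_x / 2)). Let R and t be respectively the 3x3 rotation matrix and 3x1 translation from camera to ground coordinates, so that t is the camera position in the ground frame. The ray going through a given pixel p = [u, v, 1] has direction r = inv(K) * p in camera coordinates. In world coordinates the points on that ray are t + s * R * r for s > 0. Intersect with the ground plane, i.e. solve for the s at which the height component is zero, and you are done.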
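
A minimal NumPy sketch of these steps follows, assuming the ground is the plane z = 0 of the world frame and that R and t map camera coordinates to world coordinates. The function name `pixel_to_ground` and all numbers in the example (image size, field of view, camera height, a camera looking straight down) are hypothetical and only serve to make the snippet runnable.

```python
import numpy as np

def pixel_to_ground(u, v, K, R, t):
    """Intersect the viewing ray of pixel (u, v) with the ground plane z = 0.

    R (3x3) and t (3,) map camera coordinates to world coordinates:
    X_world = R @ X_cam + t, so t is the camera centre in the world frame.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_world = R @ ray_cam
    if abs(ray_world[2]) < 1e-12:
        raise ValueError("ray is parallel to the ground plane")
    # Solve t_z + s * ray_z = 0 for the ray parameter s.
    s = -t[2] / ray_world[2]
    if s <= 0:
        raise ValueError("ground plane is behind the camera for this pixel")
    return (t + s * ray_world)[:2]

# HYPOTHETICAL camera: 1920x1080 image, 60 degree horizontal field of view.
w, h = 1920, 1080
f = (w / 2) / np.tan(np.radians(60) / 2)  # focal length in pixels
K = np.array([[f, 0, w / 2],
              [0, f, h / 2],
              [0, 0, 1.0]])

# HYPOTHETICAL pose: camera 200 cm above the ground, looking straight down.
# Camera axes: x right, y down in the image, z along the viewing direction.
R = np.diag([1.0, -1.0, -1.0])
t = np.array([0.0, 0.0, 200.0])

print(pixel_to_ground(960, 540, K, R, t))   # image centre: the point below the camera
print(pixel_to_ground(1750, 480, K, R, t))
```

For a tilted camera only R changes. The intersection step stays the same as long as the ground is the plane z = 0.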