5 votes

I want to convert 2D image coordinates to 3D world coordinates. I am using the ZED camera, which is a stereo camera, and the SDK shipped with it provides the disparity map, so I have depth. The two cameras are parallel to each other. Although this problem seems trivial, I am unable to figure out the mathematics behind the conversion. I have the following information:

1) I have the pixel information (i.e. the row and column number denoted by u and v) and the depth D at that pixel in meters.

2) I also know the focal length and the cx, cy values of both cameras.

3) The distance B (baseline) between the two centers of projection is known.

I want to know how one can go from a pixel (row, column, depth) in the image to (X, Y, Z) in world coordinates.

Assume the origin of the world coordinate system is the point on the ground vertically below the central point between the two cameras. (The height at which the camera is mounted is known; call it H.)

Thank you.


2 Answers

5 votes

Since you already know the depth D in meters for each pixel, you don't need the baseline B between the cameras (you would only need it to compute the depth from the disparity values). In fact, D is already the Z coordinate you are looking for.

The general formula for the pinhole camera model (assuming there is no distortion), with u the column index and v the row index, is:

u = fx * (X / Z) + cx
v = fy * (Y / Z) + cy

So it is then straightforward to compute the 3D coordinates:

X = Z / fx * (u - cx)
Y = Z / fy * (v - cy)
[Z = D]

Note that this is only correct if you are working with a rectified image (or an image with low distortion).
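
For what it's worth, here is a minimal C++ sketch of the formulas above, combined with the shift into the world frame described in the question (origin on the ground, vertically below the cameras, camera mounted at height H). The axis convention (camera X right, Y down, Z forward; world Y up) and the example numbers are assumptions, so adapt them to your setup:

#include <cstdio>

struct Point3D { float x, y, z; };

Point3D pixelToWorld(float u, float v, float D,
                     float fx, float fy, float cx, float cy,
                     float H)
{
    // Pinhole back-projection into the camera frame
    float Xc = D / fx * (u - cx);   // X right
    float Yc = D / fy * (v - cy);   // Y down (image Y grows downwards)
    float Zc = D;                   // depth is already the Z coordinate

    // World frame: origin on the ground below the camera, Y pointing up
    Point3D p;
    p.x = Xc;
    p.y = H - Yc;   // camera is mounted at height H above the origin
    p.z = Zc;
    return p;
}

int main()
{
    // Made-up example values: fx = fy = 350 px, principal point (336, 188),
    // pixel (u = 400, v = 200) with depth 2.5 m, camera 1.2 m above ground
    Point3D p = pixelToWorld(400.0f, 200.0f, 2.5f,
                             350.0f, 350.0f, 336.0f, 188.0f, 1.2f);
    printf("X = %.3f m, Y = %.3f m, Z = %.3f m\n", p.x, p.y, p.z);
    return 0;
}

If the depth map is registered to the left camera and you want the origin midway between the two cameras, you would additionally shift X by half the baseline (B / 2); the sign depends on your convention.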

0 votes

Just stumbled upon this. I am also using the ZED camera.

Just FYI for anyone interested, the ZED APIs (SDK v1.2) provide a nice function for this: you can simply retrieve the XYZ map of all points on the image using

sl::zed::Mat xyz = zed->retrieveMeasure(sl::zed::XYZABGR);

Not sure if this is what you were after or not, but it is definitely something useful :)
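
As a follow-up, here is a hypothetical sketch (not verified against the SDK documentation) of how one might read the 3D point for a single pixel (u, v) out of that measure. It assumes the returned sl::zed::Mat stores 4 floats per pixel (X, Y, Z plus packed color) in row-major order and exposes raw data and width members, as in SDK 1.x; check the docs of your SDK version for the exact layout:

// Assumption: 4 floats per pixel, row-major layout, raw `data`/`width` members
sl::zed::Mat xyz = zed->retrieveMeasure(sl::zed::XYZABGR);
float* px = (float*)xyz.data + 4 * (v * xyz.width + u);
float X = px[0];  // meters, camera frame
float Y = px[1];
float Z = px[2];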