
I have a complete stereo calibration with all results from (Python) OpenCV (i.e. all the necessary inputs and outputs of stereoRectify).

[Figure: visualization of the stereo camera setup]

My goal is to compute, for each of the two cameras, the camera center in world coordinates and the positions of arbitrary image coordinates (in pixels) in the world coordinate system after stereo rectification. I later want to intersect the resulting viewing rays (going from each camera center through the back-projected image coordinates in the world coordinate system) with a plane in 3D that I have already computed in the world coordinate system.
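The intersection step itself is not the problem; given a camera center C and a ray direction d in world coordinates, it is only a few lines of numpy (a minimal sketch, assuming the plane is written as n · x = c; all variable names here are my own):

import numpy as np

def intersect_ray_plane(C, d, n, c):
    # Ray: X(t) = C + t * d; plane: n . X = c
    denom = n @ d
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the plane
    t = (c - n @ C) / denom
    return C + t * d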

For the unrectified cameras, I can simply apply the inverse rotation and inverse translation to transform points from the coordinate system of the right camera to that of the left camera (which I treat as the world coordinate system). The transformation from 2D image coordinates to 3D rays in the camera coordinate system is obtained from the camera matrix.
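Concretely, with x_right = R @ x_left + T from stereoCalibrate, this is what I do for the unrectified right camera (a sketch with my own variable names; (u, v) is assumed to be already undistorted):

import numpy as np

# Right camera center in world (= left camera) coordinates:
# x_right = R @ x_left + T, so the center (x_right = 0) lies at -R^T T
C_right = (-R.T @ T).ravel()

# Ray direction for pixel (u, v): invert the pinhole projection in the
# right camera's frame, then rotate the direction into the world frame
d_cam = np.linalg.inv(K2) @ np.array([u, v, 1.0])
d_world = R.T @ d_cam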

However, after rectification both cameras are rotated towards each other (to make their image planes coplanar) and horizontally aligned using R_rect (both steps together are summarized in R1 and R2). Furthermore, the camera matrices change and we get new projection matrices P1 and P2. I am having trouble reverting these transformations.

Example: I have a point [u, v] in the rectified image of the right camera. Using the projection matrix P2, I can back-project this point into 3D (in the coordinate system of the rectified right camera) and obtain a point [X, Y, Z]. How do I get the position of this point in the world coordinate system (i.e. that of the unrectified left camera)?
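To make it concrete, here is the inverse chain I believe should work, although I am not sure it is correct (a sketch assuming R2 rotates unrectified right-camera coordinates into the rectified frame, i.e. x_rect = R2 @ x_cam, and that the rectified camera matrix is P2[:, :3]):

import numpy as np

# Pixel (u, v) in the rectified right image -> ray in the rectified frame
K_rect2 = P2[:, :3]
d_rect = np.linalg.inv(K_rect2) @ np.array([u, v, 1.0])

# Undo the rectifying rotation to get back into the unrectified right
# camera's frame, then rotate into the world frame (unrectified left camera)
d_cam2 = R2.T @ d_rect
d_world = R.T @ d_cam2

# The ray origin is the right camera center (rectification is a pure
# rotation about the center, so it does not move)
C_right = (-R.T @ T).ravel()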


1 Answer


This is super late and I'm mostly answering in case it helps anyone in the future. This may not be what the OP was looking for, but a colleague and I calibrated a stereo camera and were trying to calculate the depth of the chessboard corners used in calibration. Since we knew the precise pixel coordinates of the corners in each (left and right) image, no rectification or 1D search was necessary. However, because stereoRectify() returns the projection matrices for the imaginary, rectified cameras, we ended up building P1 and P2 manually from K1, K2, R, and T:

import numpy as np

P1 = np.hstack((K1, np.zeros(shape=(3, 1))))  # P1 = K1 @ [I | 0]: left camera at the world origin
P2 = K2 @ np.hstack((R, T))                   # P2 = K2 @ [R | T]: extrinsics from stereoCalibrate()

In our particular case, K1 and K2 were calculated individually for each camera using single-camera calibration (calibrateCamera()), then passed to stereoCalibrate(), which gave the rotation and translation between the cameras, R and T (as well as E and F).
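In code, that pipeline looked roughly like the following (a sketch; objpoints, imgpoints1, imgpoints2 are the usual chessboard object/image point lists and image_size is (width, height)):

import cv2

# Intrinsics from single-camera calibration, one camera at a time
_, K1, dist1, _, _ = cv2.calibrateCamera(objpoints, imgpoints1, image_size, None, None)
_, K2, dist2, _, _ = cv2.calibrateCamera(objpoints, imgpoints2, image_size, None, None)

# Extrinsics (R, T) between the cameras, plus E and F, keeping the intrinsics fixed
_, K1, dist1, K2, dist2, R, T, E, F = cv2.stereoCalibrate(
    objpoints, imgpoints1, imgpoints2,
    K1, dist1, K2, dist2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)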

It is my understanding that the last column of the projection matrix already accounts for the relationship between the two cameras of the stereo pair (often with the left camera center as the world origin), so using P2 for pixel [u2, v2] in the right camera should yield an XYZ with the same world origin as using P1 for the corresponding pixel [u1, v1] in the left camera. Note (again), however, that the world origin implied by the P1 and P2 returned by stereoRectify() is that of an imaginary, rectified left camera, not the real one.
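For our corner-depth problem, triangulating with the manually built P1 and P2 then comes down to one OpenCV call (a sketch; pts1 and pts2 are the matched corner pixel coordinates as 2xN float arrays):

import cv2

# 4xN homogeneous points in the world frame implied by P1 and P2
# (here: the real, unrectified left camera)
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).T  # Nx3 Euclidean points; X[:, 2] is the depth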

I am very much a novice, however, so please see Chapter 6 of Multiple View Geometry in Computer Vision, 2nd ed. by Richard Hartley and Andrew Zisserman for more details.