Camera pose estimation (OpenCV PnP)

Question

I am trying to get a global pose estimate from an image of four fiducials with known global positions using my webcam.

I have checked many stackexchange questions and a few papers and I cannot seem to get a a correct solution. The position numbers I do get out are repeatable but in no way linearly proportional to camera movement. FYI I am using C++ OpenCV 2.1.

At this link is pictured my coordinate systems and the test data used below.

% Input to solvePnP():
imagePoints =     [ 481, 831; % [x, y] format
                    520, 504;
                   1114, 828;
                   1106, 507]
objectPoints = [0.11, 1.15, 0; % [x, y, z] format
                0.11, 1.37, 0; 
                0.40, 1.15, 0;
                0.40, 1.37, 0]

% camera intrinsics for Logitech C910
cameraMat = [1913.71011, 0.00000,    1311.03556;
             0.00000,    1909.60756, 953.81594;
             0.00000,    0.00000,    1.00000]
distCoeffs = [0, 0, 0, 0, 0]

% output of solvePnP():
tVec = [-0.3515;
         0.8928; 
         0.1997]

rVec = [2.5279;
       -0.09793;
        0.2050]
% using Rodrigues to convert back to rotation matrix:

rMat = [0.9853, -0.1159,  0.1248;
       -0.0242, -0.8206, -0.5708;
        0.1686,  0.5594, -0.8114]

So far, can anyone see anything wrong with these numbers? I would appreciate it if someone would check them in for example MatLAB (code above is m-file friendly).

From this point, I am unsure of how to get the global pose from rMat and tVec. From what I have read in this question, to get the pose from rMat and tVec is simply:

position = transpose(rMat) * tVec   % matrix multiplication

However I suspect from other sources that I have read it is not that simple.

To get the position of the camera in real world coordinates, what do I need to do? As I am unsure if this is an implementation problem (however most likely a theory problem) I would like for someone who has used the solvePnP function successfully in OpenCV to answer this question, although any ideas are welcome too!

Thank you very much for your time.

you forgot to inverse tVec. So the right way to do this is -transpose(rMat) * tVec — Vlad

Gouda Gouda · Accepted Answer · 2015-01-09T10:09:11

I solved this a while ago, apologies for the year delay.

In the python OpenCV 2.1 I was using, and the newer version 3.0.0-dev, I have verified that to get the pose of the camera in the global frame you must:

_, rVec, tVec = cv2.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs)
Rt = cv2.Rodrigues(rvec)
R = Rt.transpose()
pos = -R * tVec

Now pos is the position of the camera expressed in the global frame (the same frame the objectPoints are expressed in). R is an attitude matrix DCM which is a good form to store the attitude in. If you require Euler angles then you can convert the DCM to Euler angles given an XYZ rotation sequence using:

roll = atan2(-R[2][1], R[2][2])
pitch = asin(R[2][0])
yaw = atan2(-R[1][0], R[0][0])

Camera pose estimation (OpenCV PnP)

3 Answers