72 votes

I have a task to locate an object in a 3D coordinate system. Since I need almost exact X and Y coordinates, I decided to track a single color marker with a known Z coordinate, placed on top of the moving object, like the orange ball in this picture: [undistorted image]

First, I did the camera calibration to get the intrinsic parameters, and after that I used cv::solvePnP to get the rotation and translation vectors, as in the following code:

std::vector<cv::Point2f> imagePoints;
std::vector<cv::Point3f> objectPoints;
//img points are green dots in the picture
imagePoints.push_back(cv::Point2f(271.,109.));
imagePoints.push_back(cv::Point2f(65.,208.));
imagePoints.push_back(cv::Point2f(334.,459.));
imagePoints.push_back(cv::Point2f(600.,225.));

//object points are measured in millimeters because calibration is done in mm also
objectPoints.push_back(cv::Point3f(0., 0., 0.));
objectPoints.push_back(cv::Point3f(-511.,2181.,0.));
objectPoints.push_back(cv::Point3f(-3574.,2354.,0.));
objectPoints.push_back(cv::Point3f(-3400.,0.,0.));

cv::Mat rvec(1,3,cv::DataType<double>::type);
cv::Mat tvec(1,3,cv::DataType<double>::type);
cv::Mat rotationMatrix(3,3,cv::DataType<double>::type);

cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
cv::Rodrigues(rvec,rotationMatrix);
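
As a quick sanity check of the pose, the reference points can be reprojected with cv::projectPoints and compared to the clicked pixels; a rough sketch of such a check (the projectedPoints variable is just illustrative):

std::vector<cv::Point2f> projectedPoints;
cv::projectPoints(objectPoints, rvec, tvec, cameraMatrix, distCoeffs, projectedPoints);

for (size_t i = 0; i < projectedPoints.size(); ++i)
{
    // The reprojection error should be on the order of a pixel if the
    // calibration and the point correspondences are good.
    double dx = projectedPoints[i].x - imagePoints[i].x;
    double dy = projectedPoints[i].y - imagePoints[i].y;
    std::cout << "reprojection error for point " << i << ": "
              << std::sqrt(dx*dx + dy*dy) << " px" << std::endl;
}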

After having all the matrices, the following equation can help me transform an image point to world coordinates:

$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \left( R \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} + t \right) $$

where M is the cameraMatrix, R the rotationMatrix, t the tvec, and s is an unknown scale factor. Zconst represents the height at which the orange ball sits; in this example it is 285 mm. So, first I need to solve the previous equation to get "s", and after that I can find the X and Y coordinates of a selected image point:

$$ \begin{bmatrix} X \\ Y \\ Z_{const} \end{bmatrix} = R^{-1} \left( s \, M^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} - t \right) $$

Solving this, I can find the variable "s" from the last row of the matrices, because Zconst is known; the expression is spelled out just below, and here is the code for that:
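
Spelling out that last row (using $[\,\cdot\,]_3$ for the third component of a vector, a notation of my own):

$$ s = \frac{Z_{const} + \left[ R^{-1} t \right]_3}{\left[ R^{-1} M^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \right]_3} $$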

cv::Mat uvPoint = (cv::Mat_<double>(3,1) << 363, 222, 1); // u = 363, v = 222, got this point using mouse callback

cv::Mat leftSideMat  = rotationMatrix.inv() * cameraMatrix.inv() * uvPoint;
cv::Mat rightSideMat = rotationMatrix.inv() * tvec;

double s = (285 + rightSideMat.at<double>(2,0)) / leftSideMat.at<double>(2,0);
//285 represents the height Zconst

std::cout << "P = " << rotationMatrix.inv() * (s * cameraMatrix.inv() * uvPoint - tvec) << std::endl;

After this, I got the result: P = [-2629.5, 1272.6, 285.]

and when I compare it to the measured value, which is: Preal = [-2629.6, 1269.5, 285.]

the error is very small, which is very good, but when I move this box towards the edges of the room the errors grow to maybe 20-40 mm, and I would like to improve that. Can anyone help me with that? Do you have any suggestions?

Have you applied undistortion to your images with the intrinsic parameters? Especially at the edges of an image the distortion can be quite high with any COTS lens, which could be at least one reason why you're getting this error. – count0
@count0 Yes, I did. First I load the cameraMatrix and distCoeffs from XML files, then I apply cv::undistort(inputImage, undistorted, cameraMatrix, distCoeffs);. Afterwards, with a mouse callback, I select the orange ball and store the values in uvPoint. – Banana
I think with a COTS camera and a room with a span of a few meters, an error of 2-4 cm is what you'll have to live with. It's actually quite good for a system like that. To deal with real 3D you'll have to use a multi-view camera system anyway, and due to disparity error, objects further away from your camera will have a higher error. So the answer here is: use multiple measurements of whatever kind. – count0
Wonderfully useful example. Thanks! – Vyacheslav
Incredibly useful example, helped me solve my projection issues. – Reece Sheppard

2 Answers

14 votes

Given your configuration, errors of 20-40mm at the edges are average. It looks like you've done everything well.

Without modifying the camera/system configuration, doing better will be hard. You can try to redo the camera calibration and hope for better results, but this will not improve them a lot (and you may even end up with worse results, so don't erase your current intrinsic parameters).

As said by count0, if you need more precision you should go for multiple measurements.
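
For example, a minimal sketch of one kind of repeated measurement: average the world coordinates recovered from several frames while the marker is stationary (the helper below is only an illustration of the idea):

// Average repeated back-projected measurements of a stationary marker
// to reduce per-frame pixel noise.
cv::Point3d averageMeasurements(const std::vector<cv::Point3d>& worldPoints)
{
    cv::Point3d sum(0., 0., 0.);
    if (worldPoints.empty())
        return sum;
    for (size_t i = 0; i < worldPoints.size(); ++i)
        sum += worldPoints[i];
    double n = static_cast<double>(worldPoints.size());
    return cv::Point3d(sum.x / n, sum.y / n, sum.z / n);
}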

6 votes

Do you get the green dots (imagePoints) from the distorted or the undistorted image? The function solvePnP already undistorts the imagePoints (unless you pass no distortion coefficients, or pass them as null). You may be undistorting those imagePoints twice if you are taking them from the undistorted image, and this would end up increasing the error at the corners.

https://github.com/Itseez/opencv/blob/master/modules/calib3d/src/solvepnp.cpp
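
In other words, the image points and the distortion coefficients have to be consistent; a minimal sketch of the two consistent combinations (the variable names distortedImagePoints / undistortedImagePoints are only illustrative):

// Option 1: points picked on the original (distorted) image;
// pass distCoeffs so solvePnP undistorts them internally.
cv::solvePnP(objectPoints, distortedImagePoints, cameraMatrix, distCoeffs, rvec, tvec);

// Option 2: points picked on the image already undistorted with cv::undistort;
// pass an empty Mat so zero distortion is assumed and nothing is undistorted twice.
cv::solvePnP(objectPoints, undistortedImagePoints, cameraMatrix, cv::Mat(), rvec, tvec);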