Proposed strategy
You could project the vertices from your mesh into 2D pixel coordinates (using your calibrated camera model). Then for each face, you can find all of the pixel centres (lattice points) contained in the 2D triangle formed by its projected vertices. You may have to keep track of which triangle is the nearest in the case of overlap. Now you know which face corresponds to each pixel. This should be very fast unless your mesh is much higher resolution than your image.
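A minimal sketch of that face-to-pixel lookup (plain C++, no libraries; all type and function names are illustrative, and each face carries a single representative depth used only to resolve overlaps):

#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Vec2 { double x, y; };

// Signed edge function: positive if c is to the left of the edge a->b.
static double edge(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// One projected face: its three 2D vertices plus an approximate depth.
struct ProjectedTri { Vec2 v0, v1, v2; double depth; };

// Returns, for every pixel, the index of the nearest face whose projected
// triangle covers the pixel centre, or -1 if no face covers it.
std::vector<int> rasterizeFaceIds(const std::vector<ProjectedTri>& tris,
                                  int width, int height) {
    std::vector<int>    faceId(width * height, -1);
    std::vector<double> zbuf(width * height, std::numeric_limits<double>::max());

    for (int f = 0; f < static_cast<int>(tris.size()); ++f) {
        const ProjectedTri& t = tris[f];

        // Bounding box of the projected triangle, clamped to the image.
        int x0 = std::max(0, (int)std::floor(std::min({t.v0.x, t.v1.x, t.v2.x})));
        int x1 = std::min(width - 1, (int)std::ceil(std::max({t.v0.x, t.v1.x, t.v2.x})));
        int y0 = std::max(0, (int)std::floor(std::min({t.v0.y, t.v1.y, t.v2.y})));
        int y1 = std::min(height - 1, (int)std::ceil(std::max({t.v0.y, t.v1.y, t.v2.y})));

        if (std::abs(edge(t.v0, t.v1, t.v2)) < 1e-12) continue;  // skip degenerate faces

        for (int y = y0; y <= y1; ++y)
            for (int x = x0; x <= x1; ++x) {
                Vec2 p{ x + 0.5, y + 0.5 };                      // pixel centre
                double w0 = edge(t.v1, t.v2, p);
                double w1 = edge(t.v2, t.v0, p);
                double w2 = edge(t.v0, t.v1, p);
                bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                              (w0 <= 0 && w1 <= 0 && w2 <= 0);   // either winding
                if (inside && t.depth < zbuf[y * width + x]) {
                    zbuf[y * width + x] = t.depth;               // keep the nearest face
                    faceId[y * width + x] = f;
                }
            }
    }
    return faceId;
}

Using a single depth per face (rather than interpolating depth across the triangle) is a simplification that works when faces are small compared to the depth range of the scene; if that assumption doesn't hold, interpolate the per-vertex depths with the same edge-function weights.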
You can then find the 3D ray corresponding to each pixel using the camera model, and intersect the ray with the known face for that pixel to calculate the depth (sounds like you already did this part). This shouldn't take too long either, now that you know the plane.
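A sketch of that second step for one pixel, assuming an undistorted pinhole model with intrinsics fx, fy, cx, cy and a triangle whose vertices are already expressed in the camera frame (names again illustrative):

#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y,
                                             a.z*b.x - a.x*b.z,
                                             a.x*b.y - a.y*b.x }; }
static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Depth (camera-space Z) at pixel (u, v) for the face already known to
// cover that pixel.  Returns a negative value if the ray is parallel to
// the face plane.
double depthAtPixel(double u, double v,
                    double fx, double fy, double cx, double cy,
                    Vec3 p0, Vec3 p1, Vec3 p2) {
    // Ray through the pixel: origin at the camera centre, direction d.
    // Because d.z == 1, the ray parameter is exactly the depth Z.
    Vec3 d{ (u - cx) / fx, (v - cy) / fy, 1.0 };

    // Plane of the face: normal n, passing through p0.
    Vec3 n = cross(sub(p1, p0), sub(p2, p0));
    double denom = dot(n, d);
    if (std::abs(denom) < 1e-12) return -1.0;

    // Intersect: dot(n, s*d) == dot(n, p0)  =>  s = dot(n, p0) / dot(n, d).
    return dot(n, p0) / denom;
}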
More info on the camera projection
OpenCV has a good resource on using the camera model (below).
Basically, you can project 3D point M' to pixel coordinate m'; this is how you project your vertices to pixel positions. Going the other direction, scale is unrecoverable -- you get the ray M'/s rather than the point M'. The depth you're looking for is s, which is the 3D point's Z coordinate in the camera frame. If your mesh is in a camera-centric frame (X right, Y down, Z out), then R = Identity and t = 0; if it's not, [R|t] transforms it to be.
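In the notation of the OpenCV documentation (reproduced here for reference), the projection is

s m' = A [R|t] M'

where A is the intrinsic camera matrix and [R|t] is the extrinsic rotation and translation.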
Expanding each factor lets us see the makeup of the matrices.
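Again following the OpenCV documentation:

s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
  \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\
                  r_{21} & r_{22} & r_{23} & t_2 \\
                  r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}
  \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}

Here (u, v) is the pixel coordinate m', (X, Y, Z) is the 3D point M', and f_x, f_y, c_x, c_y are the focal lengths and principal point from your calibration.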
The code that you suggested below uses OpenCV's projectPoints function, which implements the above equation plus the lens-distortion model from your calibration (see the main OpenCV reference). You populate the matrices and it performs the multiplication. An alternative example for projectPoints is available on GitHub, and I believe the same example is discussed in this SO question.
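For reference, a minimal self-contained call to projectPoints looks like this (the intrinsics and the 3D point are placeholder values, not yours):

#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <iostream>
#include <vector>

int main() {
    // Intrinsic matrix and distortion coefficients (k1 k2 p1 p2 k3) --
    // placeholder values; use the ones from your calibration.
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                           0, 800, 240,
                                           0,   0,   1);
    cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

    // Identity rotation and zero translation: the 3D points below are
    // treated as already being in the camera frame.
    cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64F);
    cv::Mat tvec = cv::Mat::zeros(3, 1, CV_64F);

    std::vector<cv::Point3d> object = { {0.1, -0.05, 2.0} };
    std::vector<cv::Point2d> image;

    cv::projectPoints(object, rvec, tvec, K, dist, image);
    std::cout << image[0].x << ", " << image[0].y << std::endl;
    return 0;
}

With vector<Point3d>/vector<Point2d> arguments you don't need to pre-size the output; projectPoints fills it for you.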
Code suggested by asker
Apparently the following code does the job. I may need some time to pick through it given that my C++ knowledge is practically zero (I realise that it is commented out BTW):
//CString str;
//cv::Mat CamMatrix(3, 3, CV_64F);
//cv::Mat distCoeffs(5, 1, CV_64F);
//m_CamCalib.GetOpenCVInfo(&CamMatrix, &distCoeffs);
//vector<Point3d> GCP_Points;
//vector<Point2d> Image_Points;
//cv::Mat RVecs(3, 3, CV_64F); // rotation matrix
//cv::Mat TranRVecs(3, 3, CV_64F); // transposed rotation matrix
//cv::Mat TVecs(3, 1, CV_64F); // translation vector
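// the next nine assignments presumably copy the 3x3 rotation part out of the flat 4x4 camera pose matrix m_pMtrx (column-major layout assumed, with elements 12-14 holding the camera position)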
//RVecs.at<double>(0, 0) = m_CamPosMtrx.m_pMtrx[0];
//RVecs.at<double>(1, 0) = m_CamPosMtrx.m_pMtrx[1];
//RVecs.at<double>(2, 0) = m_CamPosMtrx.m_pMtrx[2];
//RVecs.at<double>(0, 1) = m_CamPosMtrx.m_pMtrx[4];
//RVecs.at<double>(1, 1) = m_CamPosMtrx.m_pMtrx[5];
//RVecs.at<double>(2, 1) = m_CamPosMtrx.m_pMtrx[6];
//RVecs.at<double>(0, 2) = m_CamPosMtrx.m_pMtrx[8];
//RVecs.at<double>(1, 2) = m_CamPosMtrx.m_pMtrx[9];
//RVecs.at<double>(2, 2) = m_CamPosMtrx.m_pMtrx[10];
//transpose(RVecs, TranRVecs);
//TVecs.at<double>(0, 0) = 0;
//TVecs.at<double>(1, 0) = 0;
//TVecs.at<double>(2, 0) = 0;
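// subtracting m_pMtrx[12..14] removes the camera position; combined with the transposed rotation and zero translation above, this expresses the point in the camera frame (assuming m_CamPosMtrx is the camera-to-world pose)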
//GCP_Points.push_back(Point3d((x - m_CamPosMtrx.m_pMtrx[12]), (y - m_CamPosMtrx.m_pMtrx[13]), (z - m_CamPosMtrx.m_pMtrx[14])));
//Image_Points.push_back(Point2d(0, 0));
//projectPoints(GCP_Points, TranRVecs, TVecs, CamMatrix, distCoeffs, Image_Points);
//bool CCameraCalibration::GetOpenCVInfo(Mat * cameraMatrix, Mat * distCoeffs)
//{
// int i,j;
// Mat projMatrix;
// CMatrix4x4 m1;
// if(cameraMatrix->rows==0) cameraMatrix->create(3,3, CV_64F);
// if(distCoeffs->rows==0) distCoeffs->create(5, 1, CV_64F);
// for(i=0;i<3;i++)
// for(j=0;j<3;j++){
// cameraMatrix->at<double>(i,j)=m_pCameraMatrix[i][j];
// }
// for(i=0;i<5;i++)
// distCoeffs->at<double>(i,0)=m_pCoefficients[i];
// return false;
//}