
I am trying to determine camera pose based on fiducial marker found in a scene.

Fiducial: http://tinypic.com/view.php?pic=4r6k3q&s=8#.VNLnWTVVK1E

Current process:

  1. Use SIFT for feature detection
  2. Use SIFT for descriptor extraction
  3. Use FLANN for matching
  4. Find the homography using CV_RANSAC
  5. Identify the corners of the fiducial
  6. Identify corners of the fiducial in the scene using perspectiveTransform()
  7. Draw lines around the corners (i.e., prove that it found the fiducial in the scene)
  8. Run camera calibration
  9. Load calibration results (cameraMatrix & distortionCoefficients)

Now I am trying to figure out the camera pose. I attempted to use:

void solvePnP(const Mat& objectPoints, const Mat& imagePoints, const Mat& cameraMatrix, const Mat& distCoeffs, Mat& rvec, Mat& tvec, bool useExtrinsicGuess=false)

where:

  • objectPoints are the fiducial corners
  • imagePoints are the fiducial corners in the scene
  • cameraMatrix is from calibration
  • distCoeffs is from calibration
  • rvec and tvec should be returned to me from this function

However, when I run this, I get a core dump error, so I am not sure what I am doing incorrectly.

I haven't found very good documentation on solvePnP() - did I misunderstand the function or the input parameters?
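For reference, this is roughly the call I am describing, as a minimal sketch (markerWidth, markerHeight, and the corner values are placeholders, not my actual code):

#include <opencv2/opencv.hpp>

//placeholder marker size, in whatever unit the translation should come out in
float markerWidth = 100.0f, markerHeight = 100.0f;

//3D corners of the planar fiducial (Z = 0), in the marker's own coordinates
std::vector<cv::Point3f> objectPoints;
objectPoints.push_back(cv::Point3f(0, 0, 0));
objectPoints.push_back(cv::Point3f(markerWidth, 0, 0));
objectPoints.push_back(cv::Point3f(markerWidth, markerHeight, 0));
objectPoints.push_back(cv::Point3f(0, markerHeight, 0));

//2D corners of the fiducial found in the scene, in the same order
//(e.g. the output of perspectiveTransform())
std::vector<cv::Point2f> imagePoints;

cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

If the two point sets differ in count or type (for example, objectPoints accidentally passed as 2D points, or one of the Mats left empty), OpenCV tends to abort with exactly this kind of crash.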

I'd appreciate your help.

Update: Here is my process:

OrbFeatureDetector detector; //Orb seems more accurate than SIFT
vector<KeyPoint> keypoints1, keypoints2; 

detector.detect(marker_im, keypoints1);
detector.detect(scene_im, keypoints2);

Mat display_marker_im, display_scene_im;
drawKeypoints(marker_im, keypoints1, display_marker_im, Scalar(0,0,255));
drawKeypoints(scene_im, keypoints2, display_scene_im, Scalar(0,0,255));

SiftDescriptorExtractor extractor; //SIFT descriptors computed at the ORB keypoints
Mat descriptors1, descriptors2;

extractor.compute( marker_im, keypoints1, descriptors1 );
extractor.compute( scene_im, keypoints2, descriptors2 );

BFMatcher matcher; //BF seems to match better than FLANN
vector< DMatch > matches;
matcher.match( descriptors1, descriptors2, matches );

Mat img_matches;
drawMatches( marker_im, keypoints1, scene_im, keypoints2,
    matches, img_matches, Scalar::all(-1), Scalar::all(-1),
    vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );

vector<Point2f> obj, scene;
for (size_t i = 0; i < matches.size(); i++) {
    obj.push_back(keypoints1[matches[i].queryIdx].pt);
    scene.push_back(keypoints2[matches[i].trainIdx].pt);
}

Mat H = findHomography(obj, scene, CV_RANSAC);

//Get the corners of the fiducial in the marker image
vector<Point2f> obj_corners(4);
obj_corners[0] = Point2f(0, 0);
obj_corners[1] = Point2f(marker_im.cols, 0);
obj_corners[2] = Point2f(marker_im.cols, marker_im.rows);
obj_corners[3] = Point2f(0, marker_im.rows);
vector<Point2f> scene_corners(4);

perspectiveTransform(obj_corners, scene_corners, H);

FileStorage fs2("cal.xml", FileStorage::READ);

Mat cameraMatrix, distCoeffs;
fs2["Camera_Matrix"] >> cameraMatrix;
fs2["Distortion_Coefficients"] >> distCoeffs;

Mat rvec, tvec;

//same points as obj_corners, just adding a z-axis (0) since the marker is planar
vector<Point3f> objp(4);
objp[0] = Point3f(0, 0, 0);
objp[1] = Point3f(marker_im.cols, 0, 0);
objp[2] = Point3f(marker_im.cols, marker_im.rows, 0);
objp[3] = Point3f(0, marker_im.rows, 0);

//with only 4 correspondences there are no outliers for RANSAC to reject,
//so plain solvePnP() would behave the same here
solvePnPRansac(objp, scene_corners, cameraMatrix, distCoeffs, rvec, tvec);

Mat rotation;
Mat viewMatrix = Mat::zeros(4, 4, CV_64F); //zero-init so the bottom row ends up as [0 0 0 1]
Rodrigues(rvec, rotation);

for(int row=0; row<3; ++row)
{
   for(int col=0; col<3; ++col)
   {
      viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
   }
   viewMatrix.at<double>(row, 3) = tvec.at<double>(row, 0);
}

viewMatrix.at<double>(3, 3) = 1.0;

cout << "rotation: " << rotation << endl;
cout << "viewMatrix: " << viewMatrix << endl;

1 Answer


Okay, so solvePnP() gives you the transformation from the model's frame (i.e., the fiducial marker's coordinate system) to the camera's frame; this is commonly called the view matrix.

Input parameters:

  • objectPoints – array of object points in the object coordinate space, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. A std::vector<cv::Point3f> can also be passed here. The points are 3D, but since they are given in the marker's own (pattern) coordinate system and the marker is planar, the Z-coordinate of each object point is 0.
  • imagePoints – array of corresponding image points, 2xN/Nx2 1-channel or 1xN/Nx1 2-channel, where N is the number of points. A std::vector<cv::Point2f> can also be passed here.
  • intrinsics – the camera matrix (focal length and principal point) from calibration.
  • distortion – the distortion coefficients; zero distortion is assumed if this is empty.
  • rvec – output rotation vector.
  • tvec – output translation vector.

Building the view matrix goes something like this:

cv::Mat rvec, tvec;
cv::solvePnP(objectPoints, imagePoints, intrinsics, distortion, rvec, tvec);
cv::Mat rotation;
cv::Mat viewMatrix = cv::Mat::zeros(4, 4, CV_64F); //zero-init so the bottom row becomes [0 0 0 1]
cv::Rodrigues(rvec, rotation);

for(int row=0; row<3; ++row)
{
   for(int col=0; col<3; ++col)
   {
      viewMatrix.at<double>(row, col) = rotation.at<double>(row, col);
   }

   viewMatrix.at<double>(row, 3) = tvec.at<double>(row, 0);
}

viewMatrix.at<double>(3, 3) = 1.0;
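A quick way to sanity-check the recovered pose is to reproject the object points through rvec/tvec and compare against the measured image points. A minimal sketch, continuing with the variables from the snippet above:

//reproject the 3D object points through the estimated pose; if the pose
//is good, these should land close to the measured imagePoints
std::vector<cv::Point2f> reprojected;
cv::projectPoints(objectPoints, rvec, tvec, intrinsics, distortion, reprojected);

double err = 0.0;
for (size_t i = 0; i < reprojected.size(); i++)
{
    cv::Point2f d = reprojected[i] - imagePoints[i];
    err += std::sqrt(d.x * d.x + d.y * d.y);
}
std::cout << "mean reprojection error: " << err / reprojected.size() << " px" << std::endl;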

Furthermore, can you share your code and error message?