I am trying to compute relative pose between two images: and I am using homography to filter feature matches. I have a fairly planar scene, and homography based relative pose estimation works pretty accurately as long as the translation between the two images is restricted to only the X and Y axes (opencv convention).
Once I start moving in the Z direction with the other camera (the first camera stays constant), the relative pose estimation doesn't work properly, it keeps estimating the pose with very low Z translation. Does homography not apply when the translation is int he Z direction, although the scene is planar?
Attaching a picture here: I move the second camera in two squares: one in the XY plane, one in the XZ plane. The red crosses are the actual poses of the camera translating (consider it ground truth), and the blue circles are the relative poses estimated through a homography based RANSAC. Notice the accuracy when moving in X and Y, and complete failure in the Z direction: all the estimates are close to the z=0 plane.
My code for decomposing the homogrpahy matrix into rotation and translation was taken from this StackExchange answer
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
pose = Mat::eye(3, 4, CV_64FC1); //3x4 matrix
float norm1 = (float)norm(H.col(0));
float norm2 = (float)norm(H.col(1));
float tnorm = (norm1 + norm2) / 2.0f;
Mat v1 = H.col(0);
Mat v2 = pose.col(0);
cv::normalize(v1, v2); // Normalize the rotation
v1 = H.col(1);
v2 = pose.col(1);
cv::normalize(v1, v2);
v1 = pose.col(0);
v2 = pose.col(1);
Mat v3 = v1.cross(v2); //Computes the cross-product of v1 and v2
Mat c2 = pose.col(2);
v3.copyTo(c2);
pose.col(3) = H.col(2) / tnorm; //vector t [R|t]
}
Is this accurate? Does the third column of the homography matrix encode full 3D translation?