Does homography hold between two images for a planar scene if camera translation is also in the Z direction?

Question

I am trying to compute the relative pose between two images, and I am using homography to filter feature matches. I have a fairly planar scene, and homography-based relative pose estimation works quite accurately as long as the translation between the two images is restricted to the X and Y axes (OpenCV convention).

Once I start moving the second camera in the Z direction (the first camera stays fixed), the relative pose estimation stops working properly: it keeps estimating the pose with a very low Z translation. Does homography not apply when the translation is in the Z direction, even though the scene is planar?

Attaching a picture here: I move the second camera in two squares: one in the XY plane, one in the XZ plane. The red crosses are the actual poses of the camera translating (consider it ground truth), and the blue circles are the relative poses estimated through a homography based RANSAC. Notice the accuracy when moving in X and Y, and complete failure in the Z direction: all the estimates are close to the z=0 plane.

My code for decomposing the homography matrix into rotation and translation was taken from this StackExchange answer:

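#include <opencv2/core.hpp>

using cv::Mat;

// Builds a 3x4 pose [R|t] from a 3x3 homography H.
// H is expected to be CV_64F so that it matches the type of pose.
void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_64FC1); // 3x4 matrix

    // Scale estimate: mean length of the first two columns of H
    double norm1 = cv::norm(H.col(0));
    double norm2 = cv::norm(H.col(1));
    double tnorm = (norm1 + norm2) / 2.0;

    // First two rotation columns: normalized first two columns of H
    Mat v1 = H.col(0);
    Mat v2 = pose.col(0);
    cv::normalize(v1, v2);

    v1 = H.col(1);
    v2 = pose.col(1);
    cv::normalize(v1, v2);

    // Third rotation column: cross product of the first two
    v1 = pose.col(0);
    v2 = pose.col(1);
    Mat v3 = v1.cross(v2);
    Mat c2 = pose.col(2);
    v3.copyTo(c2);

    // Translation: third column of H divided by the scale estimate
    pose.col(3) = H.col(2) / tnorm; // vector t in [R|t]
}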

Is this accurate? Does the third column of the homography matrix encode full 3D translation?

... show your code please? It should work well, probably just some programming or logic error. — user202729
Added code snippet I used for decomposition. Actual filtering and finding the homography matrix was done through third party libraries (openMVG) — HighVoltage

Toby Collins · Accepted Answer · 2018-01-26T10:00:59

Does homography not apply when the translation is in the Z direction, although the scene is planar?

If you have a planar scene, then all images of it taken with a perspective camera (with no lens distortion) are related by homographies. This holds whether the camera is rotating, translating, or both, and it does not depend on the direction of the translation.
If there is significant lens distortion then the images will not be related by homographies.
If the scene is non-planar then the images will be related by homographies only if there is no lens distortion and no camera translation (just rotation).
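
As a quick numerical check of the first statement, here is a minimal sketch in plain C++ (no OpenCV). It rests on several assumptions that are not in the original post: normalized image coordinates (K = I), no rotation (R = I), points mapping as X2 = X1 + t, and a plane written as n^T X = d in the first camera's frame. The helper names `planeHomography` and `transferError` are hypothetical. Under those assumptions the plane-induced homography is H = I + t n^T / d, and the sketch compares a point transferred through H with its true projection in the second camera.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Multiply a 3x3 matrix by a 3-vector.
Vec3 mul(const Mat3& M, const Vec3& v) {
    Vec3 r{0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r[i] += M[i][j] * v[j];
    return r;
}

// Plane-induced homography H = I + t n^T / d.
// Assumptions (illustrative only): K = I, R = I, X2 = X1 + t,
// and the plane satisfies n^T X1 = d in the first camera's frame.
Mat3 planeHomography(const Vec3& t, const Vec3& n, double d) {
    assert(d != 0.0);
    Mat3 H{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            H[i][j] = (i == j ? 1.0 : 0.0) + t[i] * n[j] / d;
    return H;
}

// For a 3D point X1 on the plane, return the largest coordinate difference
// between its true projection in camera 2 and its camera-1 projection
// transferred through H.
double transferError(const Vec3& X1, const Vec3& t, const Vec3& n, double d) {
    // True projection in the second camera.
    const Vec3 X2{X1[0] + t[0], X1[1] + t[1], X1[2] + t[2]};
    const double u2 = X2[0] / X2[2];
    const double v2 = X2[1] / X2[2];

    // Projection in the first camera (homogeneous), mapped through H.
    const Vec3 x1{X1[0] / X1[2], X1[1] / X1[2], 1.0};
    const Vec3 hx = mul(planeHomography(t, n, d), x1);

    return std::max(std::fabs(hx[0] / hx[2] - u2),
                    std::fabs(hx[1] / hx[2] - v2));
}
```

For example, with the tilted plane n = (0, 0.6, 0.8), d = 4 and a pure Z translation t = (0, 0, -1.5), calling `transferError` on the plane point (1, 0, 5) should give an error at the level of floating-point rounding, which is what the homography relation predicts.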

The relative pose estimation doesn't work properly, it keeps estimating the pose with very low Z translation

The 3D translation computed using homography decomposition is only determined up to scale. This means that the returned translation vector t between the two cameras differs from the true translation by a scale factor s. Unfortunately, s cannot be recovered from the homography alone. Typically, 3D reconstructions from monocular images are called metric reconstructions for this reason (rather than Euclidean reconstructions, where true scale is resolved). To resolve s, some more information is needed, such as the depth of a point on the plane or the distance moved by the camera between images.
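
One way to see where the scale ambiguity comes from is the following short derivation. It is a sketch under assumptions that are not stated in the original post: both images share the same intrinsics K, points map between the camera frames as X_2 = R X_1 + t, and the plane satisfies n^T X_1 = d in the first camera's frame (other sign conventions change the sign of the second term).

```latex
% Assumed convention (illustrative): X_2 = R X_1 + t, plane n^T X_1 = d,
% same intrinsics K for both images; \simeq means equality up to scale.
\begin{align}
X_2 &= R X_1 + t\,\frac{n^\top X_1}{d}
     = \left(R + \frac{t\,n^\top}{d}\right) X_1, \\
H &\simeq K \left(R + \frac{t\,n^\top}{d}\right) K^{-1}, \\
R + \frac{(s\,t)\,n^\top}{s\,d} &= R + \frac{t\,n^\top}{d}
  \quad \text{for any } s \neq 0 .
\end{align}
```

Scaling t and d by the same factor s leaves H unchanged, so only the ratio t/d is constrained by the homography. Knowing d (the distance from the first camera to the plane) or the length of t fixes s.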
