15
votes

I am trying to calculate scale, rotation and translation between two consecutive frames of a video. I matched keypoints and then used the OpenCV function findHomography() to calculate the homography matrix.
homography = findHomography(feature1 , feature2 , CV_RANSAC); //feature1 and feature2 are matched keypoints

My question is: how can I use this matrix to calculate scale, rotation and translation?
Can anyone provide code or an explanation of how to do it?

6
The keyword is "homography decomposition". AFAIR you can extract the rotation with a QR decomposition, but you had better google that... – Micka
Maybe this one (or its links) will help: stackoverflow.com/questions/15420693/… – Micka
This is a complex problem, but this answer explains it well: stackoverflow.com/questions/7388893/… You should try to get a deeper understanding of how the homography matrix works. By doing so you'll also learn its pros and cons. You should also look into other kinds of transforms, such as the affine transform and the rigid transform. If they can solve your problem, they are easier to use. – Øystein W.

6 Answers

2
votes

The right answer is to use the homography as it is defined, dst = H ⋅ src, and explore what it does to small segments around a particular point.

Translation

Given a single point src, the translation is its displacement under H:

T = (H ⋅ src) - src

Rotation

Given two points p1 and p2

p1' = H ⋅ p1

p2' = H ⋅ p2

Now just calculate the angle between vectors p1 p2 and p1' p2'.

Scale

You can use the same trick but now just compare the lengths: |p1 p2| and |p1' p2'|.

For a fairer estimate, use another segment orthogonal to the first and average the results. You will see that there is no constant scale factor or translation: both depend on the src location.
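The probing described above can be sketched as follows. This is a minimal illustration that assumes H is stored as a row-major array of nine doubles; the names `Mat3`, `Point2`, `LocalMotion`, `applyHomography` and `localMotion` are hypothetical helpers for this sketch, not OpenCV API. It maps a point and two short orthogonal segments through H and derives the local translation, rotation and scale from them.

```cpp
#include <array>
#include <cmath>

// Row-major 3x3 homography (hypothetical alias for this sketch).
using Mat3 = std::array<double, 9>;

struct Point2 { double x, y; };

// Apply H to a 2D point, including the perspective division.
Point2 applyHomography(const Mat3& H, Point2 p) {
    double w = H[6] * p.x + H[7] * p.y + H[8];
    return { (H[0] * p.x + H[1] * p.y + H[2]) / w,
             (H[3] * p.x + H[4] * p.y + H[5]) / w };
}

struct LocalMotion {
    double tx, ty;  // displacement of the probed point
    double angle;   // rotation in radians
    double scale;   // average length ratio of the two segments
};

// Estimate the motion around p by probing two orthogonal segments of
// length eps: one along +x and one along +y.
LocalMotion localMotion(const Mat3& H, Point2 p, double eps = 1.0) {
    Point2 q  = applyHomography(H, p);
    Point2 qx = applyHomography(H, {p.x + eps, p.y});
    Point2 qy = applyHomography(H, {p.x, p.y + eps});

    // Images of the x-segment (u) and the y-segment (v).
    double ux = qx.x - q.x, uy = qx.y - q.y;
    double vx = qy.x - q.x, vy = qy.y - q.y;

    LocalMotion m;
    m.tx = q.x - p.x;
    m.ty = q.y - p.y;
    // Average direction: u plus v rotated back by 90 degrees, which avoids
    // wrap-around problems when averaging two angles directly.
    m.angle = std::atan2(uy - vx, ux + vy);
    // Average of the two length ratios |u| / eps and |v| / eps.
    m.scale = (std::hypot(ux, uy) + std::hypot(vx, vy)) / (2.0 * eps);
    return m;
}
```

Calling `localMotion` at several different points of the frame shows how much the values vary with location; for a pure similarity transform the angle and scale come out the same everywhere.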

1
votes

Given Homography matrix H:

    |H_00, H_01, H_02|
H = |H_10, H_11, H_12|
    |H_20, H_21, H_22|

Assumptions:

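H is normalized so that H_22 = 1, which leaves 8 DOF. Additionally assuming H_20 = H_21 = 0 (no perspective component) reduces H to an affine transform with 6 DOF.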

The translations along the x and y axes are read directly from H:

tx = H_02
ty = H_12

The 2x2 submatrix in the top-left corner is decomposed to calculate shear, scaling and rotation. An easy and quick decomposition method is explained here.

Note: this method assumes an invertible matrix.
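As a rough sketch of such a decomposition (one common convention, not necessarily the one in the linked method), the 2x2 block can be factored as rotation ⋅ shear ⋅ scale. The code assumes H is a row-major array of nine doubles with H_20 = H_21 = 0; `Mat3`, `AffineParams` and `decomposeAffine` are hypothetical names introduced for illustration.

```cpp
#include <array>
#include <cmath>
#include <stdexcept>

// Row-major 3x3 matrix (hypothetical alias for this sketch).
using Mat3 = std::array<double, 9>;

struct AffineParams {
    double tx, ty;  // translation
    double angle;   // rotation in radians
    double sx, sy;  // scale along x and y
    double shear;   // shear factor m in [[1, m], [0, 1]]
};

// Factor the affine part of H as
//   [a b]   [cos -sin] [1 m] [sx  0]
//   [c d] = [sin  cos] [0 1] [ 0 sy]
// Assumes H_20 = H_21 = 0; the perspective terms are ignored.
AffineParams decomposeAffine(const Mat3& H) {
    if (H[8] == 0.0) throw std::invalid_argument("H_22 is zero");
    // Normalize so that H_22 = 1.
    double a = H[0] / H[8], b = H[1] / H[8];
    double c = H[3] / H[8], d = H[4] / H[8];

    double det = a * d - b * c;
    if (det == 0.0) throw std::invalid_argument("2x2 block is singular");

    AffineParams p;
    p.tx = H[2] / H[8];
    p.ty = H[5] / H[8];
    p.sx = std::hypot(a, c);          // length of the first column
    p.angle = std::atan2(c, a);       // direction of the first column
    p.sy = det / p.sx;                // negative if the transform flips
    p.shear = (a * b + c * d) / det;  // zero for a pure rotation + scale
    return p;
}
```

For a frame-to-frame motion with little perspective, `shear` should come out close to zero and `sx` close to `sy`; large deviations suggest the affine assumption does not hold for that pair of frames.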

0
votes

For estimating the three-dimensional transform and rotation induced by a homography, there exist multiple approaches. One of them provides closed formulas for decomposing the homography, but they are very complex. Also, the solutions are never unique.

Luckily, OpenCV 3 already implements this decomposition (decomposeHomographyMat). Given a homography and a correctly scaled intrinsics matrix, the function provides a set of up to four possible rotations and translations, together with the corresponding plane normals.

0
votes

The question seems to be about 2D parameters. A homography matrix captures perspective distortion. If the application does not create much perspective distortion, one can approximate the real-world transformation with a restricted affine transformation matrix (one that uses only scale, rotation and translation, with no shearing or flipping). The following link gives an idea of how to decompose an affine transformation into its parameters.

https://math.stackexchange.com/questions/612006/decomposing-an-affine-transformation

0
votes

Since I had to struggle for a couple of days to create my homography transformation function, I'm going to put it here for the benefit of everyone.

Here you can see the main loop where every input position is multiplied by the homography matrix h. Then the result is used to copy the pixel from the original position to the destination position.

    for (tempIn[0] = 0; tempIn[0] < stride; tempIn[0]++)
    {
        for (tempIn[1] = 0; tempIn[1] < rows; tempIn[1]++)
        {
            // Perspective divisor: very important, must not be skipped.
            // For a general homography H_20 (h[6]) and H_21 (h[7]) are not zero;
            // only H_22 is normalized to 1.
            double w = h[6] * tempIn[0] + h[7] * tempIn[1] + 1;

            tempOut[0] = (h[0] * tempIn[0] + h[1] * tempIn[1] + h[2]) / w;
            tempOut[1] = (h[3] * tempIn[0] + h[4] * tempIn[1] + h[5]) / w;

            if (tempOut[1] < destSize && tempOut[0] < destSize && tempOut[0] >= 0 && tempOut[1] >= 0)
                dest_[destStride * tempOut[1] + tempOut[0]] = src_[stride * tempIn[1] + tempIn[0]];
        }
    }

After this process the output image will contain a grid of unfilled pixels, because forward mapping does not hit every destination pixel. Some kind of filter is needed to remove the grid. In my code I used a simple linear filter.

Note: Only the central part of the original image is really required for producing a correct image. Some rows and columns can be safely discarded.
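A different way to avoid the grid, instead of filtering afterwards, is inverse (backward) mapping: loop over destination pixels and look up the source pixel through the inverse homography. This is not what the code above does; the following is only a minimal sketch of that alternative, assuming a single-channel 8-bit image stored row by row, nearest-neighbour sampling, and a row-major 3x3 matrix. `Mat3`, `invert3x3` and `warpInverse` are hypothetical names.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major 3x3 matrix (hypothetical alias for this sketch).
using Mat3 = std::array<double, 9>;

// Invert a 3x3 matrix via the adjugate; returns false if it is singular.
bool invert3x3(const Mat3& m, Mat3& inv) {
    double det = m[0] * (m[4] * m[8] - m[5] * m[7])
               - m[1] * (m[3] * m[8] - m[5] * m[6])
               + m[2] * (m[3] * m[7] - m[4] * m[6]);
    if (det == 0.0) return false;
    inv = { (m[4] * m[8] - m[5] * m[7]) / det,
            (m[2] * m[7] - m[1] * m[8]) / det,
            (m[1] * m[5] - m[2] * m[4]) / det,
            (m[5] * m[6] - m[3] * m[8]) / det,
            (m[0] * m[8] - m[2] * m[6]) / det,
            (m[2] * m[3] - m[0] * m[5]) / det,
            (m[3] * m[7] - m[4] * m[6]) / det,
            (m[1] * m[6] - m[0] * m[7]) / det,
            (m[0] * m[4] - m[1] * m[3]) / det };
    return true;
}

// Warp src (srcW x srcH) into a dstW x dstH image using dst = H * src.
// Every destination pixel is visited once, so no holes are left inside
// the region covered by the source image. Uncovered pixels stay 0.
std::vector<std::uint8_t> warpInverse(const std::vector<std::uint8_t>& src,
                                      int srcW, int srcH,
                                      const Mat3& H, int dstW, int dstH) {
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dstW) * dstH, 0);
    Mat3 inv;
    if (!invert3x3(H, inv)) return dst;  // singular H: return a blank image

    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            double w = inv[6] * x + inv[7] * y + inv[8];
            if (w == 0.0) continue;  // point maps to infinity
            // Nearest-neighbour lookup of the source position.
            long sx = std::lround((inv[0] * x + inv[1] * y + inv[2]) / w);
            long sy = std::lround((inv[3] * x + inv[4] * y + inv[5]) / w);
            if (sx >= 0 && sx < srcW && sy >= 0 && sy < srcH)
                dst[static_cast<std::size_t>(y) * dstW + x] =
                    src[static_cast<std::size_t>(sy) * srcW + sx];
        }
    }
    return dst;
}
```

Nearest-neighbour sampling keeps the sketch short; bilinear interpolation of the four neighbouring source pixels would give smoother results at the cost of a few more lines.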