5
votes

Over the past weeks I've tried learning to rectify images and with the help of the people here I've managed to understand it better. About a week ago I set up a test example which I wanted to rectify (view the image from above). This worked fine (original: http://sitedezign.net/original.jpg and rectified: http://sitedezign.net/rectified.jpg) with the function T = cv2.getPerspectiveTransform(UV_cp, XYZ_gcp) where T becomes the Homography.

When I tried to do this with real world photos it failed because the realworld coordinates aren't perfectly on a plane (but simply around 10 control points measured in X, Y and Z coordinates in space). Therefor I decided to use solvePnP and hopefully be able to create a Homography which I can use.

I tried this on the test example but didn't get the results I expected: the image isn't rectified, and my Homography calculated using solvePnP isn't equal to the Homography calculated with getPerspectiveTransform.

My code:

# Set UV (image) and XYZ (real life)
UV_cp = np.array([[1300.0, 2544.0], # left down
                  [1607.0, 1000.0], # left up
                  [3681.0, 2516.0], # right down
                  [3320.0, 983.0]], np.float32) # right up

# Z is on 0 plane, so Z=0.0
XYZ_gcp = np.array([[0.0, 400.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [300.0, 400.0, 0.0],
                    [300.0, 0.0, 0.0]], np.float32)

rvec, tvec = cv2.solvePnP(XYZ_gcp, UV_cp, K, D)
rotM_cam = cv2.Rodrigues(rvec)[0]

# calculate camera position (= translation), in mm from 0,0,0 point
cameraPosition = -np.matrix(rotM_cam).T * np.matrix(tvec)

# 3x3 Identity matrix
I = np.identity(3)

# [I|-C]
I1_extended = np.hstack((I,-cameraPosition))

# P = K*R*I
P_cam = K.dot(rotM_cam).dot(I1_extended)

# create P2 = image from above: R = 0,0,0, translation = x, y, z = 0,0,-1000 (mm)
R_rec = matr.getR(0.0,0.0,0.0)
newZ = -1000.0
new_cameraPosition = np.array([[0.0],[0.0],[newZ]])
I2_extended = np.hstack((I,new_cameraPosition))
P_rec = K.dot(R_rec).dot(I2_extended)

# correct Homography T from getPerspectiveTransform:
T = np.array([[4.70332834e-01, 9.35182514e-02, -4.24671558e+02],
              [9.62104844e-03, 9.69462117e-01, -4.92461571e+02],
              [3.54859924e-06, 6.80081146e-04, 1.00000000e+00]])

# Homography Matrix = H = P_rect * pinv(P) => P2 * pinv(P1)
H = P_rec.dot(np.linalg.pinv(P_cam))

The result is a warped image, which is far from equal to the one shown above (the rectified one). Also the Homography T which should be correct (from getPerspectiveTransform) is not close to equal to the homography calculated using the result from solvePnP (H).

H from solvePnP:
[[  1.01865631e+00   2.68683332e-01  -2.04519580e+03]
 [ -3.24304366e-02   6.82672680e-01  -1.15688010e+03]
 [  2.03399902e-05   1.24191993e-04  -5.41378561e-01]]

H from getPerspectiveTransform:
[[  4.70332834e-01   9.35182514e-02  -4.24671558e+02]
 [  9.62104844e-03   9.69462117e-01  -4.92461571e+02]
 [  3.54859924e-06   6.80081146e-04   1.00000000e+00]]

Anyone has an idea what is going wrong?

PS: The code to determine the K matrix and distortion coefficients (values are taken from my camera Pentax K-5 at 33mm focal length according to Adobe Camera Raw):

# Focal length, sensor size (mm and px)
f = 33.0 # mm
pix_width = 4928.0 # sensor size has 4928px in width
pix_height = 3624.0 # sensor size has 4928px in width
sensor_width = 23.7 # mm
sensor_height = 15.7 # mm

# set center pixel
u0 = int(pix_width / 2.0)
v0 = int(pix_height / 2.0)

# determine values of camera-matrix
mu = pix_width / sensor_width # px/mm
alpha_u = f * mu # px

mv = pix_height / sensor_height # px/mm
alpha_v = f * mv # px

# Distortion coefs 
D = np.array([[0.0, 0.0, 0.0, 0.0]])

# Camera matrix
K = np.array([[alpha_u, 0.0, u0],
              [0.0, alpha_v, v0],
              [0.0, 0.0, 1.0]])
1
Can you post the code sample corresponding to the call to solvePnP and how you get H from that ?BConic
I editted my post so it's a bit easier to read. The code to create the Homography is in the first block of code. Simply what I do: 1) set UV points in image 2) set XYZ points in real world 3) use K (camera) matrix and D (distortion coefficients) for solvePnP 4) use the result to get the Rotation Matrix and translation vector (which are almost perfectly correct (checked the values with the real world measurements) 5) created another Plane matrix (P2) woth rotation: 0,0,0 and translation: 0,0,-1000 (so 1000mm above the surface) created the homography using P_rect*pinv(P_cam)Yorian
@Yorian I tried the code you posted above but I get an error saying rvec,tvec = cv2.solvePnP(XYZ_gcp, UV_cp, K, D) ValueError : too many values to unpack. What should I do? Also, can you tell me how to find XYZ points in real world?Clive
Ok I solved my problem. Use retval, rvec, tvec = cv2.solvePnP(XYZ_gcp, UV_cp, K, D). The retval was missing in your code posted above. solvePnP returns 3 parameters and not 2.Clive

1 Answers

1
votes

Your K matrix seems appropriate, but this might not be sufficient to achieve good accuracy with real images. I think instead of giving a priori reasonable values (in particular for the optical center pixel & lens distortion coefficients), you should calibrate your camera using the calibrateCamera function (documentation link, tutorials). However, I don't think the problem you are mentionning is caused by that.

I think your problem comes from the definition of P_rec.

First, be aware that if you use newZ = -1000.0, you are actually translating the camera by 1000 meters (not millimeters).

Second, you have to be very careful what 3D points you are considering and where you want them to be projected in the image:

  1. As you used XYZ_gcp in the solvePnP function, this means that you are using these coordinates as 3D points.

  2. As you used XYZ_gcp in the getPerspectiveTransform function, this means that you are also using these as 2D coordinates. Notice that, strictly speaking, you cannot do this because getPerspectiveTransform expects two 4x2 arrays (not a 4x2 and a 4x3), but I'll assume you dropped the third coordinates which are always 0.

Hence, your P_rec should be defined so that [x; y; 1] = P_rec * [x; y; 0; 1]. Therefore, P_rec should be defined as follows:

P_rec = [ [1 0 0 0] [0 1 0 0] [0 0 0 1] ].