I am currently experimenting with ORB SLAM 2 and a stereo camera like this. I am using 2.8mm and optionally 3.6mm lenses with a resolution of 640x480 pixels for the left and right camera/image.
ORB SLAM 2 lets me define several distortion/rectifying parameters withing the settings file (*.yaml), such as:
fx, fy, cx, cy
k1, k2, p1, p2
I conducted the OpenCV camera calibration using a chessboard like described here (9x7 inner corners and 70mm square length). Later on I used this automated calibration program from MRPT which gives me the same results with less stumbling blocks.
However, ORB SLAM 2 lets me define these additional parameters for pre-rectifying the images (if I understand this correctly):
D: 1x5 Matrix -> Distortion Coefficients aquired from calibration (fx,fy,cx,cy) ?
K: 3x3 Matrix -> Intrinsic Matrix aquired from calibration (k1,k2,p1,p2,k3) ?
R: 3x3 Matrix -> Rectification Transformation ?
P: 3x4 Matrix -> New Projection Matrix ?
My questions are the following (see below for an example settings.yaml file):
A.) Is my assumption correct, that D
are the distortion coefficients
and K
is the intrinsic matrix
acquired from the checkboard calibration procedure ?
B.) Is defining fx
, fy
, cx
, cy
in settings.yaml
sufficient for pre-rectifying the images and successful operation of ORB SLAM 2 ?
C.) Do I need R
and P
matrices for successful operation of ORB SLAM 2 ?
D.) How can I acquired the R
and P
matrices? The OpenCV camera calibration procedure with the checkboard does not provide me these matrices, correct ?
Here's an example of the above mentioned settings.yaml
file of ORB SLAM 2:
%YAML:1.0
#--------------------------------------------------------------------------------------------
# Camera Parameters. Adjust them!
#--------------------------------------------------------------------------------------------
# Camera calibration and distortion parameters (OpenCV)
Camera.fx: 646.53807309613160
Camera.fy: 647.36136487241527
Camera.cx: 320.94123353073792
Camera.cy: 219.07092188981900
Camera.k1: -0.43338537102343577
Camera.k2: 0.46801812273859494
Camera.p1: 0.0039978632628183738
Camera.p2: 0.00023265675941025371
Camera.width: 640
Camera.height: 480
# Camera frames per second
Camera.fps: 20.0
# stereo baseline times fx
Camera.bf: 38.76
# Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)
Camera.RGB: 1
# Close/Far threshold. Baseline times.
ThDepth: 50
#--------------------------------------------------------------------------------------------
# Stereo Rectification. Only if you need to pre-rectify the images.
# Camera.fx, .fy, etc must be the same as in LEFT.P
#--------------------------------------------------------------------------------------------
LEFT.width: 640
LEFT.height: 480
LEFT.D: !!opencv-matrix
rows: 1
cols: 5
dt: d
data:[-0.28340811, 0.07395907, 0.00019359, 1.76187114e-05, 0.0]
LEFT.K: !!opencv-matrix
rows: 3
cols: 3
dt: d
data: [458.654, 0.0, 367.215, 0.0, 457.296, 248.375, 0.0, 0.0, 1.0]
LEFT.R: !!opencv-matrix
rows: 3
cols: 3
dt: d
data: [0.999966347530033, -0.001422739138722922, 0.008079580483432283, 0.001365741834644127, 0.9999741760894847, 0.007055629199258132, -0.008089410156878961, -0.007044357138835809, 0.9999424675829176]
LEFT.P: !!opencv-matrix
rows: 3
cols: 4
dt: d
data: [435.2046959714599, 0, 367.4517211914062, 0, 0, 435.2046959714599, 252.2008514404297, 0, 0, 0, 1, 0]
RIGHT.width: 640
RIGHT.height: 480
RIGHT.D: !!opencv-matrix
rows: 1
cols: 5
dt: d
data:[-0.28368365, 0.07451284, -0.00010473, -3.555907e-05, 0.0]
RIGHT.K: !!opencv-matrix
rows: 3
cols: 3
dt: d
data: [457.587, 0.0, 379.999, 0.0, 456.134, 255.238, 0.0, 0.0, 1]
RIGHT.R: !!opencv-matrix
rows: 3
cols: 3
dt: d
data: [0.9999633526194376, -0.003625811871560086, 0.007755443660172947, 0.003680398547259526, 0.9999684752771629, -0.007035845251224894, -0.007729688520722713, 0.007064130529506649, 0.999945173484644]
RIGHT.P: !!opencv-matrix
rows: 3
cols: 4
dt: d
data: [435.2046959714599, 0, 367.4517211914062, -47.90639384423901, 0, 435.2046959714599, 252.2008514404297, 0, 0, 0, 1, 0]
#--------------------------------------------------------------------------------------------
# ORB Parameters
#--------------------------------------------------------------------------------------------
# ORB Extractor: Number of features per image
ORBextractor.nFeatures: 800
# ORB Extractor: Scale factor between levels in the scale pyramid
ORBextractor.scaleFactor: 1.2
# ORB Extractor: Number of levels in the scale pyramid
ORBextractor.nLevels: 8
# ORB Extractor: Fast threshold
# Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.
# Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST
# You can lower these values if your images have low contrast
ORBextractor.iniThFAST: 12
ORBextractor.minThFAST: 3
#--------------------------------------------------------------------------------------------
# Viewer Parameters
#--------------------------------------------------------------------------------------------
Viewer.KeyFrameSize: 0.05
Viewer.KeyFrameLineWidth: 1
Viewer.GraphLineWidth: 0.9
Viewer.PointSize:2
Viewer.CameraSize: 0.08
Viewer.CameraLineWidth: 3
Viewer.ViewpointX: 0
Viewer.ViewpointY: -0.7
Viewer.ViewpointZ: -1.8
Viewer.ViewpointF: 500
D: 1x5 matrix: Distortion coefficients k1 k2 p1 p2 k3
,K: 3x3 matrix: Intrinsic parameters [fx, 0, cx; 0, fy, cy; 0, 0, 1]
,R: 3x3 matrix: Probably rectification matrix(same as ROS)
,P: 3x4 matrix: Final projection matrix, this should be able to calculated by using previous matrices if your framework is performing that.
– unlut