0 votes

For stereo cameras on the market, the two cameras are always mounted side by side, with a displacement perpendicular to the cameras' optical axes. I had always taken this setup for granted, but is it actually necessary? If the two cameras are not parallel and have different focal lengths, camera calibration can correct for the differences. So why are the two cameras mounted in parallel? My guess is that this gives the two cameras a large overlapping region. Am I correct?
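For reference, my understanding is that with a parallel (rectified) pair, corresponding points lie on the same image row, so matching becomes a one-dimensional search and depth follows directly from disparity as Z = f·B/d. A minimal sketch with made-up numbers (the focal length, baseline, and disparity below are illustrative placeholders, not from any real camera):

```python
# Depth from disparity for a rectified (parallel) stereo pair: Z = f * B / d.
# All values below are illustrative placeholders.
f_px = 700.0          # focal length, in pixels
baseline_m = 0.063    # baseline, in metres (~ human IPD)
disparity_px = 14.0   # measured disparity, in pixels

depth_m = f_px * baseline_m / disparity_px
print(f"depth = {depth_m:.2f} m")  # -> depth = 3.15 m
```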

Edit: In the book Learning OpenCV, the function cvStereoRectify has an argument named flags, and the book gives some explanation of it. [screenshot of the book's explanation of the flags argument omitted]
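For anyone using the newer Python API, here is a sketch of what I believe the equivalent cv2.stereoRectify call looks like. The intrinsics K1/K2, the distortion vectors, and the inter-camera rotation/translation R/T below are placeholders that would normally come from cv2.stereoCalibrate:

```python
import cv2
import numpy as np

# Placeholder calibration results; in practice these come from
# cv2.stereoCalibrate run on images of a known pattern.
K1 = np.array([[700.0,   0.0, 320.0],
               [  0.0, 700.0, 240.0],
               [  0.0,   0.0,   1.0]])
K2 = K1.copy()
d1 = np.zeros(5)                        # distortion coefficients, camera 1
d2 = np.zeros(5)                        # distortion coefficients, camera 2
R = np.eye(3)                           # rotation from camera 1 to camera 2
T = np.array([[-0.063], [0.0], [0.0]])  # translation (63 mm baseline)
image_size = (640, 480)

# Rectification computes rotations R1/R2 and projections P1/P2 that
# virtually re-aim both cameras so their optical axes are parallel and
# epipolar lines fall on the same image rows.
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
    K1, d1, K2, d2, image_size, R, T, flags=cv2.CALIB_ZERO_DISPARITY)

# Per-pixel remapping tables; apply to each frame with cv2.remap.
map1x, map1y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, image_size, cv2.CV_32FC1)
```

The closer the physical mounting is to parallel, the smaller these corrective rotations are, so less image area is lost to warping and more of the two views overlaps.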


2 Answers

1 vote

"Why are two cameras mounted in parallel?".

Two cameras are mounted horizontally, spaced at the average human interpupillary distance (IPD, also called pupillary distance, PD) of roughly 61 to 64 mm, so that the two images can be fed separately to each eye and processed by the brain to reproduce the scene as a human would see it, rather than as two cameras would.

Spacing the cameras further apart ("giant's vision") gives a better sense of depth at great distances, but a loss of stereoscopic vision at close distances, because a nearby object may be fully visible to only one camera.

Similarly, spacing the cameras very close together ("bug's vision") gives little sense of depth at great distances but perfectly good stereoscopic vision at very close distances.
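This trade-off can be roughly quantified: for a rectified pair the disparity is d = f·B/Z, so the depth change resolved by a one-pixel disparity step grows as roughly ΔZ ≈ Z²/(f·B). A small sketch with illustrative numbers (the focal length and baselines are assumptions, not measurements):

```python
# Depth resolution vs. baseline: disparity d = f * B / Z, so the depth
# change resolved by one pixel of disparity is roughly dZ = Z**2 / (f * B).
# All values below are illustrative placeholders.
f_px = 700.0  # focal length, in pixels

for baseline_m in (0.01, 0.063, 0.5):      # "bug", human IPD, "giant"
    for depth_m in (0.5, 5.0, 50.0):
        disparity_px = f_px * baseline_m / depth_m
        dz_m = depth_m ** 2 / (f_px * baseline_m)
        print(f"B={baseline_m:5.3f} m  Z={depth_m:4.1f} m  "
              f"d={disparity_px:7.2f} px  dZ per px={dz_m:8.3f} m")
```

With the narrow baseline, the disparity at 50 m is a small fraction of a pixel (essentially no depth information), while with the wide baseline the disparity at 0.5 m is hundreds of pixels, so a nearby object easily falls outside one camera's view, matching the two cases above.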

The human brain does not process stereo pairs as easily when the camera spacing is far from the individual's learned eye spacing, so incorrectly spaced images can cause eye strain and headaches in some people.

Three or more cameras are better (with appropriate additional software to process the data into something that can be presented to two eyes).

Three cameras side by side are relatively inexpensive and don't require complex processing if you simply switch the video feed between the two closer-spaced cameras and the two widest-spaced outer cameras, reducing eyestrain when moving between close-up and more distant shots.

Work is being done using four cameras and image processing to provide a better 3D image with improved depth resolution: https://blog.elphel.com/2017/09/long-range-multi-view-stereo-camera-with-4-sensors/comment-page-1

Two cameras are usually used because it is the least expensive and fastest method: simply record simultaneously what each eye (at average IPD) would see, and let the brain do the calculations, expected human shortcomings included. It is not the best method or result, but it approximates the average person's eyesight, which is good enough for most (but not all) people.

Note that it is also possible to use a single camera with a rotating aperture that tricks the brain into accepting a single image viewed with one eye as an image possessing depth. See: Patent US 20040155975 A1 and Vision III Imaging.

See an explanation of Parallax Scanning.

Cover one eye and watch these videos:

https://www.youtube.com/watch?v=nlh-RYpnIMo

https://www.youtube.com/watch?v=IpbuX1KX2G0

1 vote

Two cameras side by side seem a logical carry-over from human vision: replicating our two eyes should give the most natural result. The practical limitation of this approach is how close you can get the lenses to each other.

That being said, there are other stereo camera configurations. James Cameron (director of Avatar) explains this nicely: Avatar's Cameron-Pace 3D Camera Rig

Edit: At the time Avatar was conceived (~1994), time-of-flight (ToF) and structured-light technology was not yet available.