A picture from a camera is just a projection of a set of color samples onto a plane. Assuming the camera produces pictures with square pixels, a single pixel does not give you a position; it gives you a ray from the camera's origin through the point on the plane where that pixel was projected. The sample it recorded can lie anywhere along that ray. We'll refer to that plane as the picture plane.
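
As a rough illustration of the pixel-to-ray idea, here is a minimal sketch assuming an ideal pinhole camera with square pixels. The names `pixel_to_ray`, `point_on_ray`, `f` (focal length in pixels) and `cx`, `cy` (the pixel where the optical axis meets the picture plane) are hypothetical and chosen only for this example.

```python
import math

def pixel_to_ray(u, v, f, cx, cy):
    """Unit direction of the ray from the camera origin through pixel (u, v).

    Assumes an ideal pinhole camera with square pixels: a single focal
    length f (in pixels) and a principal point (cx, cy). The picture plane
    is placed at z = 1 in camera coordinates.
    """
    x = (u - cx) / f
    y = (v - cy) / f
    z = 1.0
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)

def point_on_ray(direction, distance):
    """A candidate 3D position for the sample, `distance` units along the ray."""
    return tuple(distance * d for d in direction)

# The pixel at the principal point looks straight down the optical axis.
center_ray = pixel_to_ray(320, 240, f=100.0, cx=320, cy=240)

# Every distance gives another valid candidate for the same pixel,
# which is why one sample alone does not fix a position.
candidates = [point_on_ray(center_ray, d) for d in (1.0, 2.0, 5.0)]
```

Lens distortion is ignored here; a real camera would need it corrected before the pixel is turned into a ray.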
One sample on its own doesn't give you much information. Two samples tell you a little more: the position of the camera relative to the plane defined by three points (the two sample points and the camera position). A third sample tells you the position of the camera relative to the samples; in general this narrows it down to a single point in space.
If you find the same three samples in another picture taken from a different camera, you can determine the relative position of the two cameras from those three samples (and their orientations from the right and up vectors of each picture plane). To get the correct distance, you also need to know the distance between the actual sample points. In the case of a checkerboard, that comes from the physical dimensions of the checkerboard.
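
The scale step can be sketched as follows. This assumes you already have a reconstruction in arbitrary units (two reconstructed checkerboard corners and the vector between the two cameras) and that you know the real size of one checkerboard square. The function names and all the numbers are made up for illustration.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def metric_scale(recon_a, recon_b, known_distance):
    """Factor that converts reconstruction units into real-world units.

    recon_a and recon_b are two reconstructed sample points whose true
    separation (known_distance) has been measured, e.g. adjacent
    checkerboard corners one square apart.
    """
    d = distance(recon_a, recon_b)
    if d == 0:
        raise ValueError("reconstructed points coincide; scale is undefined")
    return known_distance / d

def apply_scale(vector, scale):
    """Scale a 3D vector (such as the camera-to-camera offset)."""
    return tuple(scale * c for c in vector)

# Hypothetical reconstruction in arbitrary units.
corner_a = (0.0, 0.0, 2.0)
corner_b = (0.5, 0.0, 2.0)       # adjacent corner, one square away
camera_offset = (1.0, 0.0, 0.0)  # second camera relative to the first

# Suppose the printed squares are 25 mm wide.
scale = metric_scale(corner_a, corner_b, known_distance=25.0)
camera_offset_mm = apply_scale(camera_offset, scale)
```

With these example numbers the scale comes out to 50, so the cameras are 50 mm apart. In practice you would average over many corner pairs rather than trust a single one.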