
A projection matrix projects a vector from a higher dimensional space onto a subspace. I would have expected the projection matrix in OpenGL to project a point in R3 onto a 2 dimensional plane. This seems to be supported by a lot of literature on the internet. Many sites imply that the projection matrix projects the 3D world onto a plane and this is what is drawn. However I get the feeling that most of these explanations are skipping several steps. Many of them seem to contradict each other so I'd like some clarification of the conclusions I have drawn from my own analysis.

Can someone please confirm (or correct if wrong) that:

  1. The projection transformation in OpenGL is not actually a projection matrix, but rather transforms a point into clip space (which is still part of the R3 domain) and the actual projection onto a 2D plane happens later as a fixed function of the pipeline.
  2. The projection matrix doesn't apply the perspective divide; however it does need to set the w coordinate so that when the perspective divide happens later (as a fixed function of the pipeline) points are correctly placed either inside or outside of NDC.
  3. Clip space is a box between (-1,+1) on the x and y axes, and (n,f) on the z axis, while NDC is a box between (-1,+1) on all axes.

I analysed the following projection matrix to come to the above conclusions:

[ 2n/(r-l)     0     (r+l)/(r-l)      0     ]
[    0     2n/(t-b)  (t+b)/(t-b)      0     ]
[    0         0    -(f+n)/(f-n) -2fn/(f-n) ]
[    0         0         -1           0     ]
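To make the analysis concrete, here is a minimal sketch that builds this matrix in the standard glFrustum form (with (r+l)/(r-l) in the first row), applies it to one eye-space point, and then performs the perspective divide by hand. The helper names `frustum` and `transform` and the sample values are made up for illustration; they are not part of any OpenGL API.

```python
def frustum(l, r, b, t, n, f):
    """Row-major 4x4 perspective matrix in the glFrustum convention."""
    return [
        [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


# Hypothetical symmetric frustum: near = 1, far = 10.
P = frustum(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)

eye = [0.5, -0.25, -2.0, 1.0]      # eye-space point in front of the camera
clip = transform(P, eye)           # what the vertex shader would output
ndc = [c / clip[3] for c in clip[:3]]  # perspective divide (done by the pipeline)

print("clip:", clip)  # the w component equals -z_eye, here 2.0
print("ndc: ", ndc)
```

Note that the matrix multiplication alone leaves the point four-dimensional; only the division by w produces the three NDC coordinates.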

From that analysis I concluded that any point that is within the frustum will be within the clip boundaries along the x and y axes; it may be outside the boundaries along the z axis, but once the perspective divide happens (with w now being the old -z) the point will be fully inside clip space.

From this I have also concluded that for a point to be visible after the MVP transformation its x, y and z/w coordinates must be between +/-1, and that the perspective divide and actual projection happen after the vertex shader.

If applicable, please give answers specific to modern OpenGL (3.3 core or later) only.


2 Answers

4
votes
  1. The projection matrix in OpenGL transforms points into clip space. But this is already a projection. The only thing that has to be done after the matrix multiplication is the perspective divide.

  2. True

  3. Clip space is the space from [-w to w] on each axis, since the only operation that happens between clip space and NDC is the perspective divide. NDC is from [-1 to 1] on each axis.
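As a rough illustration of point 3, the clip-volume test and the later divide could be sketched like this. The function names `inside_clip_volume` and `to_ndc` are hypothetical, and the sample coordinates are arbitrary clip-space values rather than the output of any particular matrix.

```python
def inside_clip_volume(clip):
    """True if -w <= x, y, z <= w, i.e. the point survives clipping."""
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))


def to_ndc(clip):
    """Perspective divide: clip space -> normalized device coordinates."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


inside = (0.5, -0.25, 0.5, 2.0)   # every component lies within [-2, 2]
outside = (3.0, 0.0, 0.2, 2.0)    # x exceeds w, so the point is clipped

print(inside_clip_volume(inside), to_ndc(inside))
print(inside_clip_volume(outside))
```

Because the test is done against w before dividing, a point with negative w (behind the camera) is rejected automatically: no value can satisfy -w <= c <= w when w is negative.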

Additional notes:

  • Mathematically, an OpenGL projection matrix maps one 4D space of homogeneous coordinates into another 4D space (clip space). This can easily be seen from the form of the matrix (a 4x4 matrix maps 4D -> 4D). With the perspective divide, the 4D clip space is reduced by dehomogenization to the 3D NDC space (R^3).
  • A point is visible after the projection when its x, y and z coordinates are within [-w, w]. Clipping happens before the perspective divide because NDC is not necessarily a cube (it is one in OpenGL, but in DirectX, for example, NDC is x,y in [-1,1], z in [0,1]).
  • A geometric projection in general is defined as a mapping p from one space (O) into another one (T). This would be written as

    O --p--> T

    In some cases such a mapping can be described by a transformation matrix in Euclidean space (a parallel projection, for example, would work), but in a lot of cases this is not possible (especially in cases where parallel lines in O are no longer parallel in T). This is why projective spaces are required.

I better stop here now since it gets more and more complicated from the mathematical point of view, but if you want to dig more into this topic, I suggest the following articles:

Wikipedia Projective Space
Wikipedia Projective Geometry
Video about projection in general (this, and the next one)

0
votes

I guess this depends a lot on how exactly you define "projection". Wikipedia introduces mathematical projections as follows:

In mathematics, a projection is a mapping of a set (or other mathematical structure) into a subset (or sub-structure), which is equal to its square for mapping composition (or, in other words, which is idempotent).

So in other words, applying a projection twice will not change the result further. It is easy to see that if one projects a 3D point onto any 2D plane embedded in that space, this property is fulfilled.
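The idempotence property is easy to check numerically. The sketch below compares a matrix that flattens points onto the plane z = 0 with a glFrustum-style perspective matrix; the helper `matmul` and the chosen frustum values (symmetric, near = 1, far = 10) are assumptions made purely for illustration.

```python
def matmul(a, b):
    """Product of two row-major 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Orthogonal projection onto the plane z = 0 (drops the z coordinate).
flatten = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# glFrustum-style matrix for l=-1, r=1, b=-1, t=1, n=1, f=10.
n, f = 1.0, 10.0
perspective = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
    [0.0, 0.0, -1.0, 0.0],
]

# A true projection equals its own square; the perspective matrix does not.
print(matmul(flatten, flatten) == flatten)
print(matmul(perspective, perspective) == perspective)
```

The first comparison holds and the second does not, which is the sense in which a rendering "projection" matrix is not a projection in the strict mathematical meaning.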

However, the typical "projection" matrices used for rendering in 3D graphics do not fulfill that criterion. The term "projection" is used more loosely there. We actually do not want to project 3D points to a 2D subspace, where information is lost. We want to keep the depth information in screen space, for example, to be able to do depth testing. So conceptually, even after the "perspective divide", we still have a 3D space. And GL's window space is explicitly defined as 3-dimensional, with a window space z value. Only x and y are used to address the pixels in the color buffer, of course, but every generated fragment has its z value.
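To show that z survives all the way to window space, here is a small sketch of the viewport transformation, following the formulas that glViewport and glDepthRange set up. The function name `ndc_to_window` and the 800x600 viewport are illustrative assumptions, not anything prescribed by the question.

```python
def ndc_to_window(ndc, viewport, depth_range=(0.0, 1.0)):
    """Map NDC to window coordinates; the result is still three-dimensional.

    viewport is (x, y, width, height) as passed to glViewport,
    depth_range is (near, far) as passed to glDepthRange.
    """
    x0, y0, width, height = viewport
    near, far = depth_range
    xd, yd, zd = ndc
    xw = (xd + 1.0) * width / 2.0 + x0
    yw = (yd + 1.0) * height / 2.0 + y0
    zw = (far - near) / 2.0 * zd + (far + near) / 2.0
    return (xw, yw, zw)


# Centre of NDC lands in the middle of the window at depth 0.5.
print(ndc_to_window((0.0, 0.0, 0.0), (0, 0, 800, 600)))
# The near/bottom/left corner of NDC lands at the window origin, depth 0.
print(ndc_to_window((-1.0, -1.0, -1.0), (0, 0, 800, 600)))
```

The x and y results address a pixel, while the third component is the value used by the depth test.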

The term I have heard used to distinguish this kind of operation from the strict mathematical projection described above is "perspective transformation", which probably makes much more sense from a mathematical point of view. The nice thing about these transformations is that they are invertible, at least to a certain degree: the perspective divide introduces ambiguities because it maps objects behind the camera mirrored in front of it, but these lie outside the viewing frustum and typically do not pose a problem.
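The invertibility can be demonstrated with a round trip. The following sketch assumes a glFrustum-style matrix and inverts it analytically; `project` and `unproject` are hypothetical helper names, and the frustum parameters and sample point are arbitrary.

```python
def project(eye, l, r, b, t, n, f):
    """Eye space -> NDC using a glFrustum-style matrix plus the divide."""
    x, y, z = eye
    w = -z
    xc = 2 * n / (r - l) * x + (r + l) / (r - l) * z
    yc = 2 * n / (t - b) * y + (t + b) / (t - b) * z
    zc = -(f + n) / (f - n) * z - 2 * f * n / (f - n)
    return (xc / w, yc / w, zc / w)


def unproject(ndc, l, r, b, t, n, f):
    """NDC -> eye space, the inverse of project() above."""
    xd, yd, zd = ndc
    # From z_d = (f+n)/(f-n) + 2fn / ((f-n) z_e), solve for z_e.
    z = 2 * f * n / ((f - n) * zd - (f + n))
    w = -z
    x = (xd * w - (r + l) / (r - l) * z) * (r - l) / (2 * n)
    y = (yd * w - (t + b) / (t - b) * z) * (t - b) / (2 * n)
    return (x, y, z)


params = (-1.0, 2.0, -1.0, 1.0, 1.0, 10.0)  # l, r, b, t, n, f (asymmetric)
eye = (0.5, -0.25, -2.0)
ndc = project(eye, *params)
back = unproject(ndc, *params)
print(ndc)
print(back)  # recovers the original eye-space point up to rounding
```

This only works because the NDC z value is kept; with x and y alone, the depth could not be recovered.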