Confused about OpenGL transformations

1

votes

In OpenGL there is one world coordinate system with origin (0,0,0).

What confuses me is what all the transformations like glTranslate, glRotate, etc. do? Do they move objects in world coordinates, or do they move the camera? As you know, the same movement can be achieved by either moving objects or camera.

I am guessing that glTranslate and glRotate change objects, and gluLookAt changes the camera?

opengl

Why guess? Read the documentation. - Lightness Races in Orbit

BTW, OpenGL is not specific to C++ (or even C), so I removed the C++ tag. - Viktor Latypov

0

votes

All transformations are transformations on objects. Even gluLookAt is just a transformation that moves the objects as if the camera were where you tell it to be. Technically they are transformations on the vertices, but that's just semantics.

7

votes

In OpenGL there is one world coordinate system with origin (0,0,0).

Well, technically no.

What confuses me is what all the transformations like glTranslate, glRotate, etc. do? Do they move objects in world coordinates, or do they move the camera?

Neither. OpenGL doesn't know objects, OpenGL doesn't know a camera, OpenGL doesn't know a world. All that OpenGL cares about are primitives (points, lines or triangles), per-vertex attributes, normalized device coordinates (NDC) and a viewport to which the NDC are mapped.

When you tell OpenGL to draw a primitive, each vertex is processed according to its attributes. The position is one of the attributes and usually a vector with 1 to 4 scalar elements within a local "object" coordinate system. The task at hand is to somehow transform the local vertex position attribute into a position on the viewport. In modern OpenGL this happens within a small program, running on the GPU, called a vertex shader. The vertex shader may process the position in an arbitrary way. But the usual approach is to apply a number of nonsingular, linear transformations.

Such transformations can be expressed in terms of homogeneous transformation matrices. For a 3-dimensional vector, the homogeneous representation is a vector with 4 elements, where the 4th element is 1.

In computer graphics a 3-fold transformation pipeline has become sort of the standard way of doing things. First the object local coordinates are transformed into coordinates relative to the virtual "eye", hence into eye space. In OpenGL this transformation used to be called the modelview transformation. With the vertex positions in eye space, several calculations, like illumination, can be expressed in a generalized way, hence those calculations happen in eye space. Next the eye space coordinates are transformed into the so-called clip space. This transformation maps some volume in eye space to a specific volume with certain boundaries, to which the geometry is clipped. Since this transformation effectively applies a projection, in OpenGL this used to be called the projection transformation.

After clip space the positions get "normalized" by their homogeneous component (w), yielding normalized device coordinates, which are then plainly mapped to the viewport.

To recapitulate:

A vertex position is transformed from local to clip space by

vpos_eye  = MV · vpos_local
eyespace_calculations(vpos_eye);
vpos_clip =  P · vpos_eye

·: matrix-vector product (inner product of each matrix row with the column vector)

Then to reach NDC

vpos_ndc = vpos_clip / vpos_clip.w

and finally to the viewport (NDC coordinates are in the range [-1, 1]):

vpos_viewport.xy = (vpos_ndc.xy + (1,1)) * (viewport.width, viewport.height) / 2 + (viewport.x, viewport.y)

*: component-wise vector multiplication
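To make the recap concrete, here is a minimal sketch in plain Python that follows one vertex through the steps above. The helper names (mat_vec, translation, perspective, transform_vertex) and the sample values are made up for illustration and are not part of OpenGL. The projection matrix uses the layout documented for gluPerspective, and the modelview matrix is assumed to be a plain translation that puts the object 5 units in front of the eye.

```python
import math

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    """Homogeneous translation matrix, the kind glTranslate multiplies in."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fovy_deg, aspect, near, far):
    """Perspective matrix with the layout documented for gluPerspective."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def transform_vertex(vpos_local, mv, p, viewport):
    """Follow one vertex from local space to window coordinates."""
    vx, vy, width, height = viewport
    vpos_eye = mat_vec(mv, vpos_local)           # local -> eye
    vpos_clip = mat_vec(p, vpos_eye)             # eye -> clip
    w = vpos_clip[3]
    vpos_ndc = [c / w for c in vpos_clip[:3]]    # perspective divide
    window = ((vpos_ndc[0] + 1.0) * width / 2.0 + vx,
              (vpos_ndc[1] + 1.0) * height / 2.0 + vy)
    return vpos_eye, vpos_clip, vpos_ndc, window

# Assumed sample setup: object 5 units in front of the eye, 90 degree field of view.
MV = translation(0.0, 0.0, -5.0)
P = perspective(90.0, 1.0, 1.0, 10.0)
VIEWPORT = (0, 0, 800, 600)   # x, y, width, height

for vertex in ([0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]):
    eye, clip, ndc, window = transform_vertex(vertex, MV, P, VIEWPORT)
    print(vertex, "-> eye", eye, "-> ndc", ndc, "-> window", window)
```

With this setup the object's local origin ends up in the middle of the 800x600 viewport, and the vertex one unit to its right lands further right on screen.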

The OpenGL functions glRotate, glTranslate and glScale merely manipulate the transformation matrices. OpenGL used to have four transformation matrices:

modelview
projection
texture
color

Which of them the matrix manipulation functions act on can be set using glMatrixMode. Each of the matrix manipulating functions composes a new matrix by multiplying the transformation matrix it describes onto the right of the selected matrix (current = current · T), thereby replacing it. The function glLoadIdentity replaces the current matrix with identity, glLoadMatrix replaces it with a user-defined matrix, and glMultMatrix multiplies a user-defined matrix on top of it.
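The way these calls compose can be sketched without a GL context. The CurrentMatrix class below is a hypothetical stand-in for one of OpenGL's current matrices; its methods only mimic what glLoadIdentity, glLoadMatrix, glMultMatrix, glTranslate and glRotate (about the z axis) do to that matrix. Because every call multiplies onto the right, the transformation issued last is the one applied to the vertex first.

```python
import math

def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def mat_mul(a, b):
    """4x4 matrix product a · b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """4x4 matrix times 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

class CurrentMatrix:
    """Hypothetical stand-in for the matrix selected with glMatrixMode."""

    def __init__(self):
        self.current = identity()

    def load_identity(self):          # mimics glLoadIdentity
        self.current = identity()

    def load_matrix(self, m):         # mimics glLoadMatrix
        self.current = [list(row) for row in m]

    def mult_matrix(self, m):         # mimics glMultMatrix: current = current · m
        self.current = mat_mul(self.current, m)

    def translate(self, x, y, z):     # mimics glTranslate
        self.mult_matrix([[1.0, 0.0, 0.0, x],
                          [0.0, 1.0, 0.0, y],
                          [0.0, 0.0, 1.0, z],
                          [0.0, 0.0, 0.0, 1.0]])

    def rotate_z(self, angle_deg):    # mimics glRotate(angle, 0, 0, 1)
        c = math.cos(math.radians(angle_deg))
        s = math.sin(math.radians(angle_deg))
        self.mult_matrix([[c, -s, 0.0, 0.0],
                          [s, c, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0]])

vertex = [1.0, 0.0, 0.0, 1.0]

# translate, then rotate: current = T · R, so the vertex is rotated first.
a = CurrentMatrix()
a.translate(1.0, 0.0, 0.0)
a.rotate_z(90.0)
result_a = mat_vec(a.current, vertex)

# rotate, then translate: current = R · T, so the vertex is translated first.
b = CurrentMatrix()
b.rotate_z(90.0)
b.translate(1.0, 0.0, 0.0)
result_b = mat_vec(b.current, vertex)

print(result_a)
print(result_b)
```

Up to floating point rounding, the first ordering puts the vertex at (1, 1, 0) and the second at (0, 2, 0), which is why the order of the calls matters.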

So how does the modelview matrix then emulate both object placement and a camera? Well, as you already stated:

As you know, the same movement can be achieved by either moving objects or camera.

You cannot really discern between them. The usual approach is to split the object local to eye transformation into two steps:

Object to world – OpenGL calls this the "model transform"
World to eye – OpenGL calls this the "view transform"

Together they form the model-view, in fixed function OpenGL described by the modelview matrix. Now since the order of transformations is

local to world, Model matrix: vpos_world = M · vpos_local
world to eye, View matrix: vpos_eye = V · vpos_world

we can substitute by

vpos_eye = V · ( M · vpos_local ) = V · M · vpos_local

replacing V · M by the ModelView matrix =: MV

vpos_eye = MV · vpos_local

Thus you can see that what's V and what's M of the compound matrix MV is only determined by the order of operations in which you multiply onto the modelview matrix, and at which step you decide to "call it the model transform from here on".

I.e. right after a

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

you define the view. But at some point you'll start applying model transformations, and everything after that is model.
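A small sketch of why the split is only a convention, again in plain Python with made-up helper names. It assumes an unrotated camera, so that "camera at z = +5" is simply a translation by -5 along z. Moving the camera back and moving the object away produce the same modelview matrix, and applying M and then V gives the same result as applying the combined MV.

```python
def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def mat_mul(a, b):
    """4x4 matrix product a · b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """4x4 matrix times 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

# Interpretation 1: the camera sits at z = +5, the object stays at the origin.
V_camera, M_camera = translation(0.0, 0.0, -5.0), identity()

# Interpretation 2: the camera stays at the origin, the object moves to z = -5.
V_object, M_object = identity(), translation(0.0, 0.0, -5.0)

MV_camera = mat_mul(V_camera, M_camera)
MV_object = mat_mul(V_object, M_object)
print(MV_camera == MV_object)   # both readings give the same modelview matrix

# Two steps (local -> world -> eye) versus one combined modelview matrix.
V = translation(0.0, 0.0, -5.0)   # view: camera at z = +5
M = translation(2.0, 0.0, 0.0)    # model: object placed at x = 2 in the world
vpos_local = [0.0, 0.0, 0.0, 1.0]
two_steps = mat_vec(V, mat_vec(M, vpos_local))
combined = mat_vec(mat_mul(V, M), vpos_local)
print(two_steps, combined)
```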

Note that in modern OpenGL (the core profile) all the matrix manipulation functions have been removed. OpenGL's matrix stack never was feature complete, and no serious application actually used it. Most programs just loaded their self-calculated matrices with glLoadMatrix and didn't bother with the OpenGL built-in matrix manipulation routines.

And ever since shaders were introduced, the whole OpenGL matrix stack got awkward to use, to put it nicely.

The verdict: If you plan on using OpenGL the modern way, don't bother with the built-in functions. But keep in mind what I wrote, because what your shaders do will be very similar to what OpenGL's fixed function pipeline did.

3

votes

OpenGL is a low-level API; there are no higher-level concepts like an "object" and a "camera" in the "scene", so for geometry there are only two matrix modes: MODELVIEW (the product of the "camera" matrix and the "object" transformation) and PROJECTION (the projective transformation from eye space to clip space).

The distinction between "Model" and "View" (object and camera) matrices is up to you. The glRotate/glTranslate functions just multiply the currently selected matrix by the given one (without even distinguishing between ModelView and Projection).

1

votes

Those functions multiply (transform) the current matrix set by glMatrixMode(), so it depends on the matrix you're working on. OpenGL has 4 different types of matrices: GL_MODELVIEW, GL_PROJECTION, GL_TEXTURE, and GL_COLOR. Any one of those functions can change any of those matrices. So, basically, you don't transform objects, you just manipulate different matrices to "fake" that effect.

Note that gluLookAt() is just a convenience function, equivalent to translating the scene by the negated eye position and then rotating it into the camera's orientation. There's nothing special about it.
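As a rough illustration, a look-at matrix can be built by hand from exactly those two pieces. The sketch below follows the construction given in the gluLookAt reference page (forward, side and up vectors as the rows of a rotation, multiplied by a translation by the negated eye position). The function name look_at and the helpers are made up for this example.

```python
import math

def mat_mul(a, b):
    """4x4 matrix product a · b."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """4x4 matrix times 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def look_at(eye, center, up):
    """View matrix = rotation · translation(-eye), as gluLookAt builds it."""
    f = normalize([center[i] - eye[i] for i in range(3)])   # forward
    s = normalize(cross(f, up))                             # side (right)
    u = cross(s, f)                                         # corrected up
    rotation = [[s[0], s[1], s[2], 0.0],
                [u[0], u[1], u[2], 0.0],
                [-f[0], -f[1], -f[2], 0.0],
                [0.0, 0.0, 0.0, 1.0]]
    translate = [[1.0, 0.0, 0.0, -eye[0]],
                 [0.0, 1.0, 0.0, -eye[1]],
                 [0.0, 0.0, 1.0, -eye[2]],
                 [0.0, 0.0, 0.0, 1.0]]
    return mat_mul(rotation, translate)

# Camera on the +x axis looking at the origin: in eye space the origin
# should end up straight ahead, i.e. on the negative z axis.
view = look_at([5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
origin_in_eye_space = mat_vec(view, [0.0, 0.0, 0.0, 1.0])
print(origin_in_eye_space)
```

The result is an ordinary 4x4 matrix that gets multiplied onto the modelview matrix like any other, which is why the camera is not a separate concept as far as OpenGL is concerned.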

0

votes

That's true: glTranslate and glRotate change the object coordinates before rendering, and gluLookAt changes the camera coordinates.
