1
votes

I've been running some tests for a contract to improve a very old OpenGL application, and I've been surprised to find that on 10 of the 12 computers I tried, calls to glLoadMatrixf and calls to glMultMatrixf have almost identical speeds.

test1:
- init: nothing
- for the scene: call glLoadMatrixf
- for each model: glPushMatrix, glTranslate/glRotate/glScale, glDrawElements, glPopMatrix
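For concreteness, here is a minimal sketch of the test1 loop. The Model struct and its field names are placeholders of mine, not the application's actual data layout:

```c
#include <GL/gl.h>

/* Hypothetical model record -- field names are illustrative only. */
typedef struct {
    GLfloat pos[3], axis[3], angle_deg, scale[3];
    GLfloat local_matrix[16];   /* column-major, precomputed for test2 */
    GLfloat full_matrix[16];    /* view * local, precomputed for test3 */
    GLsizei index_count;
    const GLushort *indices;
} Model;

/* test1: rebuild each model's transform every frame with glTranslate/glRotate/glScale */
void draw_scene_test1(const Model *models, int count, const GLfloat *view_matrix)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(view_matrix);                    /* once per scene */
    for (int i = 0; i < count; ++i) {
        const Model *m = &models[i];
        glPushMatrix();
        glTranslatef(m->pos[0], m->pos[1], m->pos[2]);
        glRotatef(m->angle_deg, m->axis[0], m->axis[1], m->axis[2]);
        glScalef(m->scale[0], m->scale[1], m->scale[2]);
        glDrawElements(GL_TRIANGLES, m->index_count, GL_UNSIGNED_SHORT, m->indices);
        glPopMatrix();
    }
}
```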

test2:
- init: precalculate each model's private mult matrix
- for the scene: call glLoadMatrixf
- for each model: glPushMatrix, glMultMatrixf, glDrawElements, glPopMatrix
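test2 differs only in that the translate/rotate/scale sequence is baked into one matrix per model at init time (reusing the same hypothetical Model struct as above):

```c
/* test2: per-model matrix precomputed at init, concatenated with glMultMatrixf */
void draw_scene_test2(const Model *models, int count, const GLfloat *view_matrix)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(view_matrix);                    /* once per scene */
    for (int i = 0; i < count; ++i) {
        const Model *m = &models[i];
        glPushMatrix();
        glMultMatrixf(m->local_matrix);            /* view * local, done by the driver */
        glDrawElements(GL_TRIANGLES, m->index_count, GL_UNSIGNED_SHORT, m->indices);
        glPopMatrix();
    }
}
```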

test3:
- init: precalculate each model's full matrix
- for the scene: nothing
- for each model: call glLoadMatrixf, then call glDrawElements
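And test3 bypasses the matrix stack entirely by precomputing the full view-times-model matrix per model at init:

```c
/* test3: full matrix (view * local) precomputed per model, loaded directly */
void draw_scene_test3(const Model *models, int count)
{
    glMatrixMode(GL_MODELVIEW);
    for (int i = 0; i < count; ++i) {
        const Model *m = &models[i];
        glLoadMatrixf(m->full_matrix);             /* replaces the stack top outright */
        glDrawElements(GL_TRIANGLES, m->index_count, GL_UNSIGNED_SHORT, m->indices);
    }
}
```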

I'm well aware that glTranslate/glRotate/glScale are never hardware accelerated (the OpenGL FAQ states this very plainly), but I thought glMultMatrixf wasn't either. However, on most computers, test cases 2 and 3 described above, with hundreds of models, both give almost exactly the same performance (the small difference is possibly due to the added push/pop matrix), while test case 1 is significantly slower, as expected.

So my question: I can't seem to find any source on the internet that says whether glMultMatrixf is generally hardware accelerated or not. Does anyone know?

PS: upgrading this old application to a newer OpenGL standard is outside the scope of this contract.

2
It's not worth accelerating because just the upload of the 32 floats to the GPU would take too long. – ratchet freak

2 Answers

0
votes

What you are seeing is that in test2 and test3 the glDrawElements calls are the bottleneck, which masks any difference between glMultMatrixf and glLoadMatrixf; it is the extra matrix manipulation in test1 that stands out.

Doing just a matrix multiplication is actually pretty cheap (64 multiplications and 48 additions for a 4x4 matrix). The biggest cost in test1 will be glRotate, which has to compute the sine and cosine of the angle you want to rotate by before it can build and multiply in the rotation matrix.
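For a sense of scale, this is roughly the arithmetic the driver has to do for a glMultMatrixf call, written out as plain C (column-major storage, as OpenGL uses); glRotatef does the sin/cos on top of an equivalent multiply:

```c
#include <GL/gl.h>

/* Plain C equivalent of a 4x4 matrix concatenation, out = a * b,
 * in OpenGL's column-major layout: 64 multiplies and 48 adds total. */
static void mat4_mul(GLfloat out[16], const GLfloat a[16], const GLfloat b[16])
{
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            out[col * 4 + row] = a[0 * 4 + row] * b[col * 4 + 0]
                               + a[1 * 4 + row] * b[col * 4 + 1]
                               + a[2 * 4 + row] * b[col * 4 + 2]
                               + a[3 * 4 + row] * b[col * 4 + 3];
        }
    }
}
```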

0
votes

Actually, this depends on which hardware you're asking about.

All major OpenGL implementations of the past 15 years use MMX/AltiVec/SSE/AVX matrix optimizations on the CPU end of things (many drivers even list this in the version string). From my perspective, that is hardware acceleration, just not GPU-side.

A sequence of OpenGL matrix commands can actually complete faster than loading a precomputed matrix from memory; I tested this extensively myself about 10 years ago. In my own tests it was not a whole lot faster, and with modern CPUs, and with the usual rendering bottleneck these days being things like fill rate rather than vertex transformation, it is probably irrelevant.
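If you want to check this on your own target machines, a rough micro-benchmark along these lines will show which of the two calls is cheaper for a given driver. It only measures CPU/driver cost, since with the fixed-function pipeline the matrix is not sent to the GPU until a draw call actually uses it; the matrix argument and iteration count are up to you (a pure rotation keeps the repeated multiplies numerically bounded):

```c
#include <GL/gl.h>
#include <stdio.h>
#include <time.h>

/* Rough CPU-side timing of the two matrix paths; call with a current GL context.
 * Pass a pure rotation matrix so repeated glMultMatrixf stays well-behaved. */
static void time_matrix_calls(const GLfloat m[16], int iterations)
{
    glMatrixMode(GL_MODELVIEW);

    glLoadIdentity();
    clock_t t0 = clock();
    for (int i = 0; i < iterations; ++i)
        glMultMatrixf(m);                     /* driver multiplies into the stack top */
    clock_t t1 = clock();

    for (int i = 0; i < iterations; ++i)
        glLoadMatrixf(m);                     /* driver copies over the stack top */
    clock_t t2 = clock();

    printf("glMultMatrixf: %.3f s, glLoadMatrixf: %.3f s\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC);
}
```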