2
votes

I am sorry if this is a silly question, but I have wondered for a long time why there are so many example vertex shaders out there that contain a modelview matrix. In my program I have the following situation:

  • the projection matrix hardly ever changes (e.g. only on resize of the app window), and it is kept separate, which is fine,
  • model matrix changes often (e.g. transforms on the model),
  • view matrix changes fairly often as well (e.g. changing direction of viewing, moving around, ...).

If I were to use a modelview matrix in the vertex shader, I'd have to perform a matrix multiplication on the CPU and upload a single matrix. The alternative is to upload both the model and view matrices and do the multiplication on the GPU. The point is that the view matrix does not necessarily change at the same time as the model matrix, yet with a combined modelview matrix one has to perform the CPU multiplication and the upload whenever either of them changes. Why not use separate view and model matrices instead, for a fast GPU multiplication and roughly the same number of matrix uploads?
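A minimal sketch of the two alternatives the question contrasts, written as WebGL-style GLSL embedded in JavaScript (the uniform and attribute names are illustrative, not taken from any particular program):

```javascript
// Variant A: the application premultiplies view * model on the CPU and
// uploads a single combined matrix whenever either one changes.
const vsCombined = `
  uniform mat4 uProjection;
  uniform mat4 uModelView;   // computed on the CPU as view * model
  attribute vec4 aPosition;
  void main() {
    gl_Position = uProjection * uModelView * aPosition;
  }
`;

// Variant B: the application uploads model and view separately and lets
// the shader multiply them (per vertex, unless the compiler hoists it).
const vsSeparate = `
  uniform mat4 uProjection;
  uniform mat4 uView;
  uniform mat4 uModel;
  attribute vec4 aPosition;
  void main() {
    gl_Position = uProjection * uView * uModel * aPosition;
  }
`;
```

With Variant A, a change to either the model or the view matrix forces a CPU multiply plus one upload; with Variant B, each matrix is uploaded only when it actually changes, but the product is (nominally) evaluated per vertex.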

2
I've done this in WebGL when I wanted to reduce the amount of matrix math in JavaScript, so there are a few cases where it can make sense. – Grimmy

2 Answers

5
votes

Because multiplying the matrices in the vertex shader makes the GPU redo the full computation for each and every vertex that goes into it (note, though, that recent GLSL compilers may detect that the product is uniform over all vertices and hoist the calculation off the GPU onto the CPU).

Also, when it comes to performing a single 4×4 matrix multiplication, the CPU actually outperforms the GPU, because there is no data-transfer or command-queue overhead.

The general rule for GPU computing is: If it's uniform over all vertices, and you can easily precompute it on the CPU, do it on the CPU.

2
votes

Because you only need to calculate the modelview matrix once per model. If you upload the two matrices separately, the GPU will perform that multiplication for every vertex.

Now, if you are CPU-bound, it may still be a performance gain: even though you are adding (potentially) thousands of additional matrix multiplications, you are pushing them off the CPU. I'd consider that an optimization rather than a standard technique, though.