Is there any common wisdom about how much matrix math should be done on the CPU vs the GPU for common 3D operations?
A typical 3D shader potentially needs several matrices: a world matrix for surface-to-light calculations, a world-inverse-transpose matrix for transforming normals, a world-view-projection matrix for 3D projection, etc.
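As a side note on why normals need the inverse transpose rather than the world matrix itself: under a non-uniform scale, multiplying the normal by the world matrix no longer gives a vector perpendicular to the transformed surface. A minimal numeric check in plain Python (2D for brevity; the helper names and values are mine, not from any particular engine):

```python
# Surface tangent t and normal n start out perpendicular. After a
# non-uniform scale M, M*t is the new tangent, but M*n is NOT
# perpendicular to it; transpose(inverse(M))*n is.

def mul(m, v):
    # 2x2 matrix times 2D column vector
    return [m[0][0]*v[0] + m[0][1]*v[1], m[1][0]*v[0] + m[1][1]*v[1]]

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

M = [[2.0, 0.0], [0.0, 1.0]]        # non-uniform scale: x by 2
M_inv_T = [[0.5, 0.0], [0.0, 1.0]]  # transpose(inverse(M)), computed by hand

t = [1.0, 1.0]    # tangent of a 45-degree surface
n = [1.0, -1.0]   # its normal: dot(t, n) == 0

new_t = mul(M, t)             # the transformed tangent
wrong_n = mul(M, n)           # dot(new_t, wrong_n) != 0 -- no longer a normal
right_n = mul(M_inv_T, n)     # dot(new_t, right_n) == 0 -- still perpendicular
```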
There are 2 basic ways to approach this.
1. Calculate the matrices on the CPU and upload the computed matrices to the GPU.

In some CPU language:

```
worldViewProjection = world * view * projection;
worldInverseTranspose = transpose(inverse(world));
upload world, worldViewProjection, worldInverseTranspose to GPU
```

On the GPU, use `world`, `worldViewProjection`, and `worldInverseTranspose` where needed.

2. Pass the component matrices (world, view, projection) to the GPU and compute the needed matrices on the GPU.
In some CPU language:

```
upload world, view, projection to GPU
```

On the GPU:

```
worldViewProjection = world * view * projection;
worldInverseTranspose = transpose(inverse(world));
```
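To make option 1 concrete, here is a sketch of the CPU-side math in plain Python (no GPU API shown, and these tiny helpers are stand-ins for whatever math library you actually use; the example matrices are arbitrary):

```python
# Option 1: compute the combined matrices once per object on the CPU,
# then upload only the results. All matrices are 4x4, row-major.

def mat_mul(a, b):
    # 4x4 matrix multiply
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inverse(m):
    # Gauss-Jordan elimination with partial pivoting; assumes m is invertible
    n = len(m)
    aug = [list(m[i]) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Example inputs: a uniform scale for world, a translation for view,
# and identity standing in for a real projection matrix
world = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]
view = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -5, 1]]
projection = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# The two matrices the shader needs, computed once here instead of per-vertex
wvp = mat_mul(mat_mul(world, view), projection)
worldInverseTranspose = transpose(inverse(world))
# ...then upload world, wvp, worldInverseTranspose as uniforms
```

The trade-off this illustrates: the CPU does this work once per object per frame, whereas option 2 repeats the `inverse` and the multiplies for every vertex (unless the driver or compiler hoists them), at the cost of uploading one fewer matrix.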
I understand that at some level I probably just have to profile on different machines and GPUs, and that drawing a million vertices in 1 draw call might have different needs than drawing 4 vertices in 1 draw call, but still I'm wondering:
Is there any common wisdom about when to do math on the GPU vs. the CPU for matrix calculations?
Another way to ask this question: which of #1 or #2 above should be my default, after which I can later profile for those cases where the default doesn't give the best performance?