How do vertex shaders in OpenGL 4.X process a huge number of vertices?

Question

In OpenGL 4.3+, compute shaders let the user explicitly configure the number of threads in each block and how many blocks are used to process the data (glDispatchCompute). For a vertex shader, however, I do not need to provide any thread/block configuration. So for vertex shaders, is there an automatic way to distribute the workload among blocks/processes? When I have a large number of vertices to process, is it possible to explicitly provide such a configuration to the vertex shader?
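For reference, the explicit configuration mentioned for compute shaders usually comes down to a rounding-up division. The sketch below is only an illustration: `work_group_count` is a hypothetical helper name, and the local size of 256 in the comment is an assumed value, not something from the question.

```cpp
#include <cstdint>

// Hypothetical helper (not part of OpenGL): the number of work groups needed
// so that groups * local_size covers item_count, rounding up.
// local_size must be non-zero.
constexpr std::uint32_t work_group_count(std::uint32_t item_count,
                                         std::uint32_t local_size) {
    // Written as quotient plus remainder check to avoid overflow near the
    // top of the 32-bit range.
    return item_count / local_size + (item_count % local_size != 0 ? 1u : 0u);
}

// Assuming the compute shader declares `layout(local_size_x = 256) in;`,
// the host side would then call something like:
//   glDispatchCompute(work_group_count(n, 256), 1, 1);
```

A vertex shader has no equivalent of this step: a draw call only states how many vertices to process, and the grouping is left to the implementation.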

The driver/GPU itself already schedules most of the pipeline to work this way. It would be horribly inefficient if it had to transform vertices in serial using a single warp/wavefront (thread scheduling unit). Since vertex/fragment shaders cannot read the results of adjacent vertices/fragments, they are easily scheduled in parallel. The scheduling you explicitly consider in compute shaders already happens implicitly in the normal render pipeline. Load-balancing has been a major part of GPU design since the unified shader model; you have to do exotic (GL4 era) things in a shader to mess with it — Andon M. Coleman
Thanks for your comments :) In this case, does that mean the workload distribution mechanism may vary among different drivers? Is it possible to figure out any "patterns" in this distribution, in case I want to optimize my data to improve performance (e.g. re-order the vertices to reduce the cache miss rate)? — UNCAL LEE
To figure out patterns, you could probably use 'atomic counters' just like this code, which figures out rasterization patterns: geeks3d.com/20120309/… — rotoglup
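On the idea of re-ordering vertices to reduce cache misses: one rough way to compare two index orderings offline is to simulate a small vertex cache on the CPU. The sketch below assumes indexed drawing and models the cache as a plain FIFO of recently used indices. Both the FIFO policy and the cache size are illustrative assumptions and may not match any real GPU, and `count_cache_misses` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Counts how many indices miss a simulated FIFO vertex cache.
// ASSUMPTION: the FIFO model and cache_size are for illustration only;
// actual hardware caching behaviour is implementation-specific.
std::size_t count_cache_misses(const std::vector<std::uint32_t>& indices,
                               std::size_t cache_size) {
    std::deque<std::uint32_t> cache;
    std::size_t misses = 0;
    for (std::uint32_t index : indices) {
        if (std::find(cache.begin(), cache.end(), index) != cache.end()) {
            continue;  // hit: the transformed vertex could be reused
        }
        ++misses;  // miss: the vertex shader would run for this index
        cache.push_back(index);
        if (cache.size() > cache_size) {
            cache.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}
```

Running this on the same mesh with two different index orders and dividing the miss count by the triangle count gives a relative figure for comparing them. It says nothing about how a particular driver actually schedules the work.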

genpfault · Accepted Answer · 2013-11-15T22:37:03

Is it possible to explicitly provide a configuration to the vertex shader?

No.

So for vertex shaders, is there an automatic way to distribute the workload among blocks/processes?

Yes. The GPU/driver should already be taking care of that behind the scenes.

By using large batches in server-side memory you're already telling the OpenGL implementation to render those as fast as it can.

It's not like OpenGL starts up in some sort of "slow" mode that you have to turn off.
