
Referring to that question:

There are several ways to improve rendering speed for huge meshes. I tried the following implementations:

  0. Just render the mesh without any optimization or quantization.
  1. Quantize the mesh as a preprocessing step on the CPU and switch the LOD level (= quantization level) at runtime. I submit the whole vertex data and render with Drawcall(numberOfNotDegeneratedIndices). -> faster than (0)
  2. My idea now: do the whole quantization in the vertex shader (all vertex data is present for the calculations, and the LOD switching is dynamic). Triangle degeneration should happen automatically after the vertex processing stage. Drawcall(numberOfAllIndices) -> not really faster than (0). (See the sketch after this list.)
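
For reference, here is how the two quantized paths differ at the draw call, as a minimal sketch assuming OpenGL and an already-bound index buffer (the counts numberOfNotDegeneratedIndices and numberOfAllIndices come from the description above; the function names are hypothetical):

```cpp
#include <GL/gl.h>

// Hypothetical sketch, assuming OpenGL and an indexed mesh whose index
// buffer is already bound. Names follow the question's Drawcall(...) calls.

void drawMethod1(GLsizei numberOfNotDegeneratedIndices)
{
    // (1): indices were reordered on the CPU so that degenerate triangles
    // sit at the end of the buffer; only the surviving ones are submitted.
    glDrawElements(GL_TRIANGLES, numberOfNotDegeneratedIndices,
                   GL_UNSIGNED_INT, nullptr);
}

void drawMethod2(GLsizei numberOfAllIndices)
{
    // (2): the full index buffer is submitted; triangles only become
    // degenerate *after* the vertex shader snaps their positions, so the
    // GPU still runs the vertex shader for every index.
    glDrawElements(GL_TRIANGLES, numberOfAllIndices,
                   GL_UNSIGNED_INT, nullptr);
}
```

The point of the sketch: in (2) the GPU cannot skip any vertex work, because degeneracy is only known after the vertex shader has already run.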

Methods compared: the amount of vertex data submitted is always the same. Vertex shader invocations: (0) == (2) > (1)

So I was wondering: why doesn't method (2) get any faster than (0), despite the quantization and the resulting triangle degeneration?

I would like to get more insight into why it behaves like this and where the bottlenecks on the GPU could be.

If your triangles don't touch pixel centres, you're already not paying a pixel shading cost or anything downstream of that. Whatever your bottleneck is, it's not pixel shading; it could be vertex shading, it could be the cost of fetching the vertex shader inputs, it could be something upstream. In any case, making your vertex shading more expensive is not going to help; reducing the number of verts the GPU sees at all, as in your (1), will. – moonshadow
When you say "quantize", what exactly are you doing? – Nicol Bolas
Quantization: map the position to [-1, 1], multiply by 2^level, round(), divide by 2^level. – Mr. X
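
For concreteness, that snapping step could look like the following, as a minimal C++ sketch assuming positions are already normalized to [-1, 1] (the function name is hypothetical):

```cpp
#include <cmath>

// Snap one normalized coordinate (in [-1, 1]) to the quantization grid
// for the given LOD level, as described in the comment above: multiply
// by 2^level, round to the nearest integer, divide by 2^level.
float quantize(float x, int level)
{
    const float scale = std::ldexp(1.0f, level); // 2^level
    return std::round(x * scale) / scale;
}
```

All vertices of a small triangle that snap to the same grid point make that triangle degenerate (zero area), which is what the LOD scheme above relies on.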

1 Answer


I hate to bring up the obvious, but have you tried resizing your framebuffer to something absurd like 1x1 and confirming that the bottleneck is in fact vertex processing?
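
A minimal sketch of that test, assuming OpenGL; shrinking the viewport approximates shrinking the framebuffer, since almost nothing gets rasterized (the function name and indexCount are hypothetical):

```cpp
#include <GL/gl.h>

// Hypothetical bottleneck test, assuming OpenGL: render the same frame
// into a 1x1 viewport so that fill/pixel-shading cost drops to almost
// nothing. If the frame time barely changes, the bottleneck is upstream
// (vertex fetch, vertex shading, primitive processing), not fillrate.
void drawWithTinyViewport(GLsizei indexCount)
{
    glViewport(0, 0, 1, 1);
    glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr);
}
```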

Given no screenshot or anything to go by, I have to guess what the "huge" mesh you are trying to render looks like; I can think of a lot of scenarios where a huge mesh leads to massive overdraw, at which point you could actually be fillrate bound and using different LODs would make very little difference.

As funny as it sounds, you can also run into rasterization performance bottlenecks if you draw a lot of (potentially invisible) subpixel triangles. Many will never survive to shading because they do not satisfy the coverage rules, but if you get enough tiny primitives all the way into the rasterization stage you do pay a penalty unrelated to vertex processing. LODs work great for solving that problem, though, so it is unlikely to be your problem here.