I need to render meshes that can be very dense. The vertex data is completely static. Apparently large VBOs are not great for performance (source: http://www.opengl.org/wiki/Vertex_Specification_Best_Practices#Size_of_a_VBO.2FIBO, though unfortunately that page doesn't link to its own source). And even if large buffers were fine for performance, the total size of my vertex data sometimes exceeds what I can successfully allocate with glBufferData().
So assuming that I need to break my mesh into smaller VBOs of a few MB each, which of these methods is recommended:
1) Allocate enough buffers at startup to hold all of the mesh data. Rendering is then as simple as binding each buffer in turn and calling glDrawArrays(). (See the first sketch after this list.)
2) Allocate a small fixed pool of buffers at startup. Rendering would then mean filling one buffer with a block of triangles, calling glDrawArrays(), filling the next buffer with the next block, calling glDrawArrays() again, and so on. That's potentially a lot more CPU work. (See the second sketch after this list.)
3) Some other method I'm not thinking of.
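To make the comparison concrete, here is a rough sketch of what I mean by option 1, in C. Everything here is my own illustration, not from any API: the MeshChunk struct, the function names, the chunk layout, and the single position attribute are all assumptions, and I'm assuming a loader like GLEW has already set up the function pointers.

```c
#include <GL/glew.h>
#include <stddef.h>

/* Hypothetical layout: each chunk of the mesh lives in its own VBO
 * of at most a few MB. MeshChunk/uploadChunks/drawChunks are names
 * I made up for illustration. */
typedef struct {
    GLuint  vbo;       /* buffer object holding this chunk's vertices */
    GLsizei vertCount; /* number of vertices in the chunk */
} MeshChunk;

/* Called once at startup: one glBufferData per chunk; the data is
 * never touched again after this. */
static void uploadChunks(MeshChunk *chunks, int chunkCount,
                         const unsigned char *verts, size_t stride,
                         const GLsizei *countPerChunk)
{
    size_t offset = 0;
    for (int i = 0; i < chunkCount; ++i) {
        glGenBuffers(1, &chunks[i].vbo);
        glBindBuffer(GL_ARRAY_BUFFER, chunks[i].vbo);
        glBufferData(GL_ARRAY_BUFFER,
                     (GLsizeiptr)(countPerChunk[i] * stride),
                     verts + offset, GL_STATIC_DRAW);
        chunks[i].vertCount = countPerChunk[i];
        offset += countPerChunk[i] * stride;
    }
}

/* Called every frame: just rebind and draw each chunk. */
static void drawChunks(const MeshChunk *chunks, int chunkCount, size_t stride)
{
    glEnableVertexAttribArray(0);  /* attribute 0 = position, by assumption */
    for (int i = 0; i < chunkCount; ++i) {
        glBindBuffer(GL_ARRAY_BUFFER, chunks[i].vbo);
        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE,
                              (GLsizei)stride, (void *)0);
        glDrawArrays(GL_TRIANGLES, 0, chunks[i].vertCount);
    }
}
```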
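And a corresponding sketch of option 2, again with made-up names (POOL_SIZE, drawStreamed) and the same loader assumption. It uses the commonly suggested buffer-orphaning idiom, re-specifying the store with glBufferData(..., NULL, ...) before writing, so that refilling a buffer doesn't stall on a draw the GPU hasn't finished:

```c
#include <GL/glew.h>
#include <stddef.h>

/* Hypothetical pool: POOL_SIZE is a number I picked, not a
 * recommendation. pool[] is assumed to have been glGenBuffers'd at
 * startup, and attribute 0 to be enabled already. */
#define POOL_SIZE 4
static GLuint pool[POOL_SIZE];

static void drawStreamed(const unsigned char *verts, size_t stride,
                         size_t totalVerts, size_t blockVerts)
{
    int next = 0;
    for (size_t first = 0; first < totalVerts; first += blockVerts) {
        size_t n = totalVerts - first;
        if (n > blockVerts) n = blockVerts;

        glBindBuffer(GL_ARRAY_BUFFER, pool[next]);
        /* "Orphan" the previous store so the driver can hand back fresh
         * memory instead of waiting on any in-flight draw that reads it. */
        glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)(blockVerts * stride),
                     NULL, GL_STREAM_DRAW);
        glBufferSubData(GL_ARRAY_BUFFER, 0, (GLsizeiptr)(n * stride),
                        verts + first * stride);

        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE,
                              (GLsizei)stride, (void *)0);
        glDrawArrays(GL_TRIANGLES, 0, (GLsizei)n);

        next = (next + 1) % POOL_SIZE;
    }
}
```

This is what I mean by "more CPU work": option 2 re-copies the whole mesh through glBufferSubData() every frame, whereas option 1 uploads everything exactly once.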
Part of my question comes down to how memory allocation for VBOs works: if I allocate enough small buffers at startup to hold all of the mesh data, will I run into the same memory limit that prevents me from allocating a single buffer large enough to hold everything? Or are VBOs integrated with virtual memory, such that the OpenGL implementation will swap VBOs out of graphics memory for me when I exceed what's available?
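For what it's worth, the only portable signal I've found is that glBufferData() raises GL_OUT_OF_MEMORY when the implementation can't create the data store, so I could probe with something like the sketch below. But since drivers may defer the actual allocation, I don't know how much a "success" here really proves, which is part of what I'm asking:

```c
#include <GL/glew.h>
#include <stddef.h>

/* Probe whether a store of 'bytes' can be allocated right now.
 * GL_OUT_OF_MEMORY is the documented error when glBufferData() can't
 * create the store, but a clean result is only a hint if the driver
 * defers the real allocation. */
static int tryAllocate(GLuint vbo, size_t bytes)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    while (glGetError() != GL_NO_ERROR) { /* drain any stale errors */ }
    glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)bytes, NULL, GL_STATIC_DRAW);
    return glGetError() == GL_NO_ERROR;
}
```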
Finally, how much of this is actually implementation-dependent? Are there useful references from AMD, Intel, or NVIDIA that explain best practices for buffer management?