I need to render meshes that can be very dense. The vertex data is completely static. Apparently large VBOs are not great for performance (source: http://www.opengl.org/wiki/Vertex_Specification_Best_Practices#Size_of_a_VBO.2FIBO, though unfortunately that page doesn't link to its own source). And even if large buffers were fine for performance, the total size of my vertex data sometimes exceeds what I can successfully allocate with glBufferData().
So assuming that I need to break my mesh into smaller VBOs of a few MB each, which of these methods is recommended:
1) Allocate enough buffers at startup to hold all of the mesh data. Rendering is then as simple as binding each buffer in turn and calling glDrawArrays(). (See the first sketch after this list.)
2) Allocate a small fixed pool of buffers at startup. Rendering would then mean filling one buffer with a block of triangles, calling glDrawArrays(), filling the next buffer with the next block, calling glDrawArrays() again, and so on. That's potentially a lot more CPU work. (See the second sketch after this list.)
3) Some other method I'm not thinking of.
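To make the comparison concrete, here is a rough sketch of what I mean by option 1, in C. Everything here is my own illustration, not from any API: the MeshChunk struct, the function names, the chunk layout, and the single position attribute are all assumptions, and I'm assuming a loader like GLEW has already set up the function pointers.

```c
#include <GL/glew.h>
#include <stddef.h>

/* Hypothetical layout: each chunk of the mesh lives in its own VBO
 * of at most a few MB. MeshChunk/uploadChunks/drawChunks are names
 * I made up for illustration. */
typedef struct {
    GLuint  vbo;       /* buffer object holding this chunk's vertices */
    GLsizei vertCount; /* number of vertices in the chunk */
} MeshChunk;

/* Called once at startup: one glBufferData per chunk; the data is
 * never touched again after this. */
static void uploadChunks(MeshChunk *chunks, int chunkCount,
                         const unsigned char *verts, size_t stride,
                         const GLsizei *countPerChunk)
{
    size_t offset = 0;
    for (int i = 0; i < chunkCount; ++i) {
        glGenBuffers(1, &chunks[i].vbo);
        glBindBuffer(GL_ARRAY_BUFFER, chunks[i].vbo);
        glBufferData(GL_ARRAY_BUFFER,
                     (GLsizeiptr)(countPerChunk[i] * stride),
                     verts + offset, GL_STATIC_DRAW);
        chunks[i].vertCount = countPerChunk[i];
        offset += countPerChunk[i] * stride;
    }
}

/* Called every frame: just rebind and draw each chunk. */
static void drawChunks(const MeshChunk *chunks, int chunkCount, size_t stride)
{
    glEnableVertexAttribArray(0);  /* attribute 0 = position, by assumption */
    for (int i = 0; i < chunkCount; ++i) {
        glBindBuffer(GL_ARRAY_BUFFER, chunks[i].vbo);
        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE,
                              (GLsizei)stride, (void *)0);
        glDrawArrays(GL_TRIANGLES, 0, chunks[i].vertCount);
    }
}
```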
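And a corresponding sketch of option 2, again with made-up names (POOL_SIZE, drawStreamed) and the same loader assumption. It uses the commonly suggested buffer-orphaning idiom, re-specifying the store with glBufferData(..., NULL, ...) before writing, so that refilling a buffer doesn't stall on a draw the GPU hasn't finished:

```c
#include <GL/glew.h>
#include <stddef.h>

/* Hypothetical pool: POOL_SIZE is a number I picked, not a
 * recommendation. pool[] is assumed to have been glGenBuffers'd at
 * startup, and attribute 0 to be enabled already. */
#define POOL_SIZE 4
static GLuint pool[POOL_SIZE];

static void drawStreamed(const unsigned char *verts, size_t stride,
                         size_t totalVerts, size_t blockVerts)
{
    int next = 0;
    for (size_t first = 0; first < totalVerts; first += blockVerts) {
        size_t n = totalVerts - first;
        if (n > blockVerts) n = blockVerts;

        glBindBuffer(GL_ARRAY_BUFFER, pool[next]);
        /* "Orphan" the previous store so the driver can hand back fresh
         * memory instead of waiting on any in-flight draw that reads it. */
        glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)(blockVerts * stride),
                     NULL, GL_STREAM_DRAW);
        glBufferSubData(GL_ARRAY_BUFFER, 0, (GLsizeiptr)(n * stride),
                        verts + first * stride);

        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE,
                              (GLsizei)stride, (void *)0);
        glDrawArrays(GL_TRIANGLES, 0, (GLsizei)n);

        next = (next + 1) % POOL_SIZE;
    }
}
```

This is what I mean by "more CPU work": option 2 re-copies the whole mesh through glBufferSubData() every frame, whereas option 1 uploads everything exactly once.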
Part of my question comes down to how memory allocation for VBOs works: if I allocate enough small buffers at startup to hold all of the mesh data, will I run into the same memory limit that prevents me from allocating a single buffer large enough to hold everything? Or are VBOs integrated with virtual memory, such that the OpenGL implementation will swap VBOs out of graphics memory for me when I exceed what's available?
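For what it's worth, the only portable signal I've found is that glBufferData() raises GL_OUT_OF_MEMORY when the implementation can't create the data store, so I could probe with something like the sketch below. But since drivers may defer the actual allocation, I don't know how much a "success" here really proves, which is part of what I'm asking:

```c
#include <GL/glew.h>
#include <stddef.h>

/* Probe whether a store of 'bytes' can be allocated right now.
 * GL_OUT_OF_MEMORY is the documented error when glBufferData() can't
 * create the store, but a clean result is only a hint if the driver
 * defers the real allocation. */
static int tryAllocate(GLuint vbo, size_t bytes)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    while (glGetError() != GL_NO_ERROR) { /* drain any stale errors */ }
    glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)bytes, NULL, GL_STATIC_DRAW);
    return glGetError() == GL_NO_ERROR;
}
```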
Finally, how much of this is actually implementation-dependent? Are there useful references from AMD, Intel, or NVIDIA that explain best practices for buffer management?