4 votes

I'm working on a 3D simulation (2D for now), using CUDA for the computation and OpenGL for the rendering. My question concerns the interoperability between CUDA and OpenGL. As far as I can see, there are two general approaches to do this:

Number one would be to use a Pixel Buffer Object (or Vertex Buffer Object) which is mapped to CUDA global memory and afterwards copied to an OpenGL texture. This is said to be a very fast approach.

Number two would be to copy a texture object directly to CUDA texture memory, which would also be very nice, since I could then use all the texture memory features like texture caching and so on.

Now, could someone explain to me what the general differences between those two approaches are, and for what kinds of cases each of them is normally used?


1 Answer

5 votes

The entire difference is, like you said, in texture caching. So choosing between these two methods depends on whether or not you want to exploit texture caching in your visualization. Let's look at some common cases:

a) If you want to calculate how a surface is deformed (for example the surface of water, or maybe some elastic deformation) and you need the new vertices for the surface's polygonal mesh, then you would typically use buffers (number one), except you wouldn't need to copy to an OpenGL texture here. In fact there would be no copying involved at all: you would just reference the buffer in CUDA and use it as a GL buffer (see the sketch after case b) below).

b) If you have a particle simulation and need to know the updated particle positions, you would also just use a buffer, as in the above case.
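For cases a) and b), a minimal sketch of this buffer path might look like the following. Names such as vbo, deformVertices, numBlocks, threadsPerBlock and numVertices are placeholders, and it assumes the VBO was already created with glBufferData:

#include <cuda_gl_interop.h>

// One-time setup: register the existing OpenGL VBO with CUDA.
cudaGraphicsResource* vboResource = nullptr;
cudaGraphicsGLRegisterBuffer(&vboResource, vbo, cudaGraphicsRegisterFlagsNone);

// Every simulation step: map the buffer, get a device pointer, run the kernel, unmap.
cudaGraphicsMapResources(1, &vboResource);
float4* devVertices = nullptr;
size_t numBytes = 0;
cudaGraphicsResourceGetMappedPointer((void**)&devVertices, &numBytes, vboResource);

// Hypothetical kernel that writes the new vertex positions directly into the VBO.
deformVertices<<<numBlocks, threadsPerBlock>>>(devVertices, numVertices, time);

cudaGraphicsUnmapResources(1, &vboResource);
// After unmapping, render as usual, e.g. glBindBuffer(GL_ARRAY_BUFFER, vbo) and glDrawArrays(...).

The point is that the same memory backs both the kernel writes and the draw call, so no copy happens anywhere.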

c) If you have a finite element grid simulation, where each fixed volume cell in space gains a new value, and you need to visualize it via volume rendering or an isosurface, then you would want to use a texture object (2D or 3D, depending on the dimensionality of your simulation), because when you're casting rays, or even generating streamlines, you will almost always want the neighboring texels to be cached. And you can avoid doing any copying here too: as in the above method, you can directly reference some CUDA texture memory (a cudaArray) from OpenGL. You would use these calls to do that:

// One-time setup: register the GL texture with CUDA.
cudaGraphicsGLRegisterImage(&myCudaGraphicsResource, textureNameFromGL, GL_TEXTURE_3D, cudaGraphicsRegisterFlagsNone);
...
// Each time you need it in CUDA: map the resource and get the underlying cudaArray.
cudaGraphicsMapResources(1, &myCudaGraphicsResource);
cudaGraphicsSubResourceGetMappedArray(&mycudaArray, myCudaGraphicsResource, 0, 0);
cudaGraphicsUnmapResources(1, &myCudaGraphicsResource);

This texture data can then be referenced in CUDA via mycudaArray, while the same memory can be referenced in OpenGL via textureNameFromGL.
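If you then want to sample mycudaArray from a kernel with the texture cache (the raycasting case above), one way to do it, sketched here with the CUDA runtime's texture object API and placeholder names like volumeTex and raycast, is the following. Do this after cudaGraphicsSubResourceGetMappedArray, while the resource is still mapped, and it assumes the texture holds single-channel float data:

// Describe the resource: the cudaArray obtained from the mapped GL texture.
cudaResourceDesc resDesc = {};
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = mycudaArray;

// Describe how the texture is sampled.
cudaTextureDesc texDesc = {};
texDesc.addressMode[0] = cudaAddressModeClamp;
texDesc.addressMode[1] = cudaAddressModeClamp;
texDesc.addressMode[2] = cudaAddressModeClamp;
texDesc.filterMode = cudaFilterModeLinear;   // use cudaFilterModePoint if you need exact texel values
texDesc.readMode = cudaReadModeElementType;
texDesc.normalizedCoords = 1;

cudaTextureObject_t volumeTex = 0;
cudaCreateTextureObject(&volumeTex, &resDesc, &texDesc, nullptr);

// Inside a kernel you then get cached, hardware-filtered reads:
// __global__ void raycast(cudaTextureObject_t vol, ...) { float v = tex3D<float>(vol, x, y, z); ... }

cudaDestroyTextureObject(volumeTex);   // when you are done with it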

Copying from a buffer into a texture is a bad idea, because if you need this data for texture caching you will be doing an additional copy. This was more popular in earlier versions of CUDA, before texture interop was supported.

You could also use textures in the a) and b) cases as well. On some hardware it might even work faster, but this is very hardware dependent. Keep in mind that texture reads also apply minification and magnification filters, which is extra work if all you're looking for is an exact value stored in a buffer.

For sharing texture resources between CUDA and OpenGL, please refer to this sample:

https://github.com/nvpro-samples/gl_cuda_interop_pingpong_st