After reading this article, I wanted to try the same thing, but with the rendering part performed on the GPU to speed things up. It goes without saying that triangles and other geometric primitives are better rendered on the GPU than on the CPU.

Here's one nice image of the process:

Mona Lisa

The task:

  1. Render 'set of vertices'
  2. Estimate the pixel-by-pixel difference between the rendered 'set of vertices' and the Mona Lisa image (the Mona Lisa resides on the GPU in a texture or a PBO; it makes no big difference which)
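
For reference, step 2 can be sketched on the CPU as a sum of squared per-channel differences. This is only a minimal illustration of the metric, not the GPU implementation asked about below; the function name `image_difference` and the choice of squared error on tightly packed 8-bit channels are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: sum of squared per-channel differences between two
// images stored as tightly packed 8-bit channels (e.g. RGBA).
// Returns 0 for identical images; larger values mean a worse match.
std::uint64_t image_difference(const std::vector<std::uint8_t>& rendered,
                               const std::vector<std::uint8_t>& target) {
    if (rendered.size() != target.size()) {
        throw std::invalid_argument("images must have the same size");
    }
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < rendered.size(); ++i) {
        // Widen to int before subtracting so the difference can be negative.
        const int d = static_cast<int>(rendered[i]) - static_cast<int>(target[i]);
        total += static_cast<std::uint64_t>(d * d);
    }
    return total;
}
```

A GPU version would compute the same per-pixel term in parallel and then reduce the results to a single number.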

The problem:

The problem arises when using OpenCL or CUDA together with the OpenGL FBO (Framebuffer Object) extension.

In this case, according to our task:

  1. Render 'set of vertices' (handled by OpenGL)
  2. Estimate the pixel-by-pixel difference between the rendered 'set of vertices' and the Mona Lisa image (handled by OpenCL or CUDA)

So in this case I'm forced to copy from the FBO to a PBO (Pixel Buffer Object) to make the rendered 'set of vertices' available to OpenCL/CUDA. I know how fast device-to-device memory copies are, but since I would need to do thousands of them, it makes sense to avoid them altogether.

This problem leaves three choices:

  1. Render with OpenGL directly to a PBO (I don't know how, and it may not be possible at all)
  2. Render the image and estimate the difference between the images entirely with OpenGL (I don't know how; maybe by using shaders, but I've never written a shader in my life, so this might take me months of work)
  3. Render the image and estimate the difference between the images entirely with OpenCL/CUDA (I know how to do this, but it would also take months to get a stable and reasonably optimized renderer implemented in OpenCL or CUDA)

The question

Can anybody help me write a shader for the above process, or point out a way of rendering the Mona Lisa to a PBO without copies from the FBO?

What language are you using? If you're using the original C# version, then I found a massive speed improvement by changing the comparison functions to use the unsafe keyword. This gave me nearly double the performance. - Russ Clarke
The language is C#, and I'm using my own implementation. I need to optimize the device part, not the C# part. - Lu4
Can't you just read from an OpenGL texture in CUDA or OpenCL? I thought they support 2D images too, which you can connect directly to OpenGL textures, but I'm ready to be convinced otherwise. This would free you from the copy, as FBOs can render directly into a texture. - Christian Rau
Yes, I can read from a texture directly with CUDA and OpenCL, but how do I render directly to a texture? - Lu4

1 Answer

My gut feeling is that the shader approach is going to have the same I/O problem. You certainly can compare textures in a shader as long as the GPU supports PS 4.0 or higher, but you've still got to get the source texture (Mona Lisa) onto the device in the first place.
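
To make the idea of comparing textures in a shader more concrete, here is a rough CPU model of one way it could be organized: a first pass writes the per-pixel difference, and further passes repeatedly sum 2x2 blocks into a half-size target until a single value is left. This is an assumption about how such a shader pipeline might be structured, not the code from the forum post below. The names `diff_pass`, `reduce_pass` and `total_difference` are hypothetical, and the sketch assumes square single-channel images whose side is a power of two.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pass 1: per-pixel absolute difference. Each loop iteration stands in for
// one fragment-shader invocation sampling both textures at the same texel.
std::vector<float> diff_pass(const std::vector<float>& rendered,
                             const std::vector<float>& target) {
    std::vector<float> out(rendered.size());
    for (std::size_t i = 0; i < rendered.size(); ++i) {
        out[i] = std::fabs(rendered[i] - target[i]);
    }
    return out;
}

// Reduction pass: every output pixel is the sum of one 2x2 block of the
// input, so the side length is halved. `side` must be even.
std::vector<float> reduce_pass(const std::vector<float>& src, std::size_t side) {
    const std::size_t half = side / 2;
    std::vector<float> out(half * half);
    for (std::size_t y = 0; y < half; ++y) {
        for (std::size_t x = 0; x < half; ++x) {
            const std::size_t i = (2 * y) * side + 2 * x;  // top-left of the block
            out[y * half + x] = src[i] + src[i + 1] + src[i + side] + src[i + side + 1];
        }
    }
    return out;
}

// Runs the difference pass, then reduces until a single value remains.
// Assumes both images hold side * side values and side is a power of two.
float total_difference(const std::vector<float>& rendered,
                       const std::vector<float>& target, std::size_t side) {
    std::vector<float> buf = diff_pass(rendered, target);
    while (side > 1) {
        buf = reduce_pass(buf, side);
        side /= 2;
    }
    return buf[0];
}
```

On the GPU, each pass would presumably render into a texture attached to an FBO, so only the final 1x1 result would have to be read back to the host.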

Edit: I've been digging around a bit, and this forum post might provide some insight:

http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=221384&page=1.

The poster, Komat, provides an example of the shader on the 2nd page.