After reading this article I wanted to try the same thing, but with the rendering done on the GPU to speed things up. Needless to say, triangles and other geometric primitives are better rendered on the GPU than on the CPU.
Here's one nice image of the process:
The task:
- Render 'set of vertices'
- Estimate the difference, pixel by pixel, between the rendered 'set of vertices' and the Mona Lisa image (the Mona Lisa already resides on the GPU, in a texture or a PBO; it makes no big difference which)
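For reference, the difference step could look like the following on the CPU. This is only a minimal sketch of what the GPU version would have to compute, not a definitive metric: the function name `imageDifference`, the tightly packed 8-bit RGBA layout, and the choice of a sum of absolute per-channel differences are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <vector>

// Hypothetical helper: sum of absolute per-channel differences between two
// images. Both buffers are ASSUMED to be tightly packed 8-bit RGBA of the
// same dimensions; a lower result means the images are more alike.
std::uint64_t imageDifference(const std::vector<std::uint8_t>& rendered,
                              const std::vector<std::uint8_t>& target) {
    if (rendered.size() != target.size()) {
        throw std::invalid_argument("images must have the same size");
    }
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < rendered.size(); ++i) {
        // Widen to int before subtracting so the result cannot wrap around.
        const int diff = static_cast<int>(rendered[i]) - static_cast<int>(target[i]);
        total += static_cast<std::uint64_t>(std::abs(diff));
    }
    return total;
}
```

On the GPU the same loop would become one thread (or one fragment) per pixel followed by a parallel reduction of the per-pixel values.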
The problem:
It arises when using OpenCL or CUDA together with the OpenGL FBO (Frame Buffer Object) extension.
In that case the task is split as follows:
- Render 'set of vertices' (handled by OpenGL)
- Estimate the difference, pixel by pixel, between the rendered 'set of vertices' and the Mona Lisa image (handled by OpenCL or CUDA)
So in this case I'm forced to copy from the FBO to a PBO (Pixel Buffer Object) to make the rendered 'set of vertices' available to OpenCL/CUDA. I know how fast device-to-device memory copies are, but since I would need to do thousands of these copies, it makes sense to avoid them altogether.
This problem leaves three choices:
- Render with OpenGL directly to a PBO (somehow; I don't know how, and it might be impossible)
- Render the image and estimate the difference between the images entirely in OpenGL (somehow; I don't know how, maybe with shaders. The only problem is that I've never written a shader in my life, and this might take me months of work)
- Render the image and estimate the difference between the images entirely in OpenCL/CUDA (I know how to do this, but it would also take months to get a stable and more or less optimized renderer implemented in OpenCL or CUDA)
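To give an idea of what the core of the third option involves, here is a minimal CPU sketch in C++ of edge-function triangle filling, the kind of per-pixel inside test an OpenCL/CUDA renderer would run in parallel. The names `Point`, `edge` and `fillTriangle`, the single-channel image, and sampling at pixel centers are assumptions for illustration; blending, anti-aliasing and fill rules are left out entirely.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point { float x, y; };  // hypothetical 2D vertex type

// Signed edge function: its sign tells on which side of the edge a->b
// the point p lies (zero means p is exactly on the edge).
inline float edge(const Point& a, const Point& b, const Point& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Fill one triangle into a width*height single-channel image.
// A pixel is covered when its center lies inside or on the triangle.
void fillTriangle(std::vector<std::uint8_t>& image, int width, int height,
                  Point a, Point b, Point c, std::uint8_t value) {
    // Restrict the scan to the triangle's bounding box, clamped to the image.
    const int minX = std::max(0, static_cast<int>(std::floor(std::min({a.x, b.x, c.x}))));
    const int maxX = std::min(width - 1, static_cast<int>(std::ceil(std::max({a.x, b.x, c.x}))));
    const int minY = std::max(0, static_cast<int>(std::floor(std::min({a.y, b.y, c.y}))));
    const int maxY = std::min(height - 1, static_cast<int>(std::ceil(std::max({a.y, b.y, c.y}))));

    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Point p{x + 0.5f, y + 0.5f};  // pixel center
            const float w0 = edge(a, b, p);
            const float w1 = edge(b, c, p);
            const float w2 = edge(c, a, p);
            // Accept either winding order: all three signs must agree.
            const bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                                (w0 <= 0 && w1 <= 0 && w2 <= 0);
            if (inside) {
                image[static_cast<std::size_t>(y) * width + x] = value;
            }
        }
    }
}
```

Each pixel's test is independent of the others, which is why this maps naturally onto one GPU thread per pixel; the months of work lie in everything around it.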
The question:
Can anybody help me write a shader for the above process, or point out a way of rendering the Mona Lisa to a PBO without copies from the FBO?