I have a WPF application that acquires images from a camera, processes them, and displays them. The processing has become too heavy for the CPU, so I'm looking at moving it to the GPU and running custom CUDA kernels on the images. The basic process is as follows:
1) acquire image from camera
2) load image onto GPU
3) call CUDA kernel to process image
4) display processed image
What I'm trying to figure out is a WPF-to-CUDA-to-display-control strategy. It seems natural that once the image is loaded onto the GPU, it should not have to be copied back to host memory in order to be displayed. I've read that this can be done with OpenGL, but do I really need to learn OpenGL and include it in my project just to get a fast display of a CUDA-processed image?
I understand (I think) the issues involved in calling CUDA kernels from C#. My plan is to either build an unmanaged library around my CUDA calls and then wrap it for C#, or pick one of the managed wrappers (managedCUDA, Cudafy, etc.) and use that. I worry about relying on one of the prebuilt wrappers because they all appear to be lightly supported, but maybe I have the wrong impression.
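To make the first option concrete, here is a rough sketch of the kind of flat C entry point I imagine the unmanaged library exposing. The names `ProcessImage`, `invert_pixels`, and `IMGPROC_API` are made up for illustration, and the body is a plain CPU loop standing in for the spot where the host-to-device copy, kernel launch, and copy back would go, so the sketch compiles with an ordinary C++ compiler and no CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdint>

// Export macro (hypothetical name): dllexport on Windows, plain extern "C" elsewhere.
#if defined(_WIN32)
#  define IMGPROC_API extern "C" __declspec(dllexport)
#else
#  define IMGPROC_API extern "C"
#endif

// Hypothetical stand-in for the GPU work. In the real library this is where
// the image would be copied to the device, the CUDA kernel launched, and the
// result copied back (or left on the device for display).
static void invert_pixels(const std::uint8_t* src, std::uint8_t* dst, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = static_cast<std::uint8_t>(255 - src[i]);
}

// Flat C entry point: only raw pointers and ints cross the boundary, which
// keeps it easy to call from C# through P/Invoke.
// Assumes an 8-bit, single-channel image with tightly packed rows.
// Returns 0 on success and 1 on invalid arguments.
IMGPROC_API int ProcessImage(const std::uint8_t* src, std::uint8_t* dst, int width, int height)
{
    if (src == nullptr || dst == nullptr || width <= 0 || height <= 0)
        return 1;

    const std::size_t count =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    invert_pixels(src, dst, count);
    return 0;
}
```

On the C# side I would expect to declare this with something like `[DllImport("ImgProc.dll", CallingConvention = CallingConvention.Cdecl)] static extern int ProcessImage(byte[] src, byte[] dst, int width, int height);`, where the DLL name is again only a placeholder. Note that this shape still round-trips the pixels through host memory, which is exactly the copy I'd like to avoid for display.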
Anyway, I'm feeling a bit overwhelmed after days of researching the possible options. Any advice would be greatly appreciated.