9 votes

For my CUDA development, I am using a machine with 16 cores and a GTX 580 GPU with 16 SMs. For the work I am doing, I plan to launch 16 host threads (one per core), with one kernel launch per thread, each using 1 block and 1024 threads. My goal is to run 16 kernels in parallel, one on each of the 16 SMs. Is this possible/feasible?

I have tried to read as much as possible about independent contexts, but there does not seem to be much information available. As I understand it, each host thread can have its own GPU context, but I am not sure whether the kernels will run in parallel if I use independent contexts.

Alternatively, I could read all the data from all 16 host threads into one giant structure and pass it to the GPU to launch a single kernel. However, that would involve too much copying and would slow down the application.

2 – Multiple contexts cannot simultaneously use a single GPU, so no, this won't work. – talonmies
Thanks. Can you please put the above as an answer so that I can accept it? – gmemon
@gmemon See my comment below for creating and executing multiple contexts in CUDA 5.5. Did you successfully execute 16 kernels on 16 SMs? What was your solution in the end? – Tariq

2 Answers

4 votes

While a multi-threaded application can hold multiple CUDA contexts simultaneously on the same GPU, those contexts cannot perform operations concurrently. When active, each context has sole use of the GPU and must yield before another context (which could belong to a rendering API or a display manager) can access the GPU.
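For concreteness, here is a minimal driver-API sketch (assuming pthreads; the per-thread work is elided) of the one-context-per-thread setup the question describes. It runs, but the contexts still serialize: only one is ever active on the GPU at a time.

    #include <cuda.h>
    #include <pthread.h>

    #define NUM_THREADS 16

    /* One context per host thread, as the question proposes. */
    static void *worker(void *arg)
    {
        CUdevice dev;
        CUcontext ctx;
        (void)arg;

        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);   /* context becomes current on this thread */

        /* ... allocate, copy, and launch here; while this context is
           active, no other context can use the GPU ... */

        cuCtxDestroy(ctx);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_THREADS];
        int i;

        cuInit(0);                   /* must precede any other driver API call */
        for (i = 0; i < NUM_THREADS; ++i)
            pthread_create(&threads[i], NULL, worker, NULL);
        for (i = 0; i < NUM_THREADS; ++i)
            pthread_join(threads[i], NULL);
        return 0;
    }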

So, in a word, no: this strategy can't work with any current CUDA version or hardware.

6 votes

You can only have one context active on a GPU at a time. One way to achieve the sort of parallelism you require would be to use CUDA streams: create 16 streams inside the single context and launch memcopies and kernels into the streams by name. There is a quick webinar on using streams at http://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf. The full API reference is in the CUDA Toolkit manuals; the CUDA 4.2 manual is available at http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_Toolkit_Reference_Manual.pdf.
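For illustration, here is a minimal sketch of the streams approach, assuming a hypothetical processChunk kernel and 16 equal-sized input chunks, with one block of 1024 threads per launch to match the question's plan (Fermi-class GPUs such as the GTX 580 can run up to 16 kernels concurrently). Error checking is omitted for brevity.

    #include <cuda_runtime.h>

    #define NUM_STREAMS 16
    #define THREADS_PER_BLOCK 1024

    /* Placeholder kernel standing in for the real per-thread work. */
    __global__ void processChunk(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] *= 2.0f;
    }

    int main(void)
    {
        const int chunkSize = THREADS_PER_BLOCK;   /* one block's worth per kernel */
        float *h_data, *d_data;
        cudaStream_t streams[NUM_STREAMS];

        /* Pinned host memory is required for truly asynchronous copies. */
        cudaMallocHost(&h_data, NUM_STREAMS * chunkSize * sizeof(float));
        cudaMalloc(&d_data, NUM_STREAMS * chunkSize * sizeof(float));

        for (int i = 0; i < NUM_STREAMS; ++i)
            cudaStreamCreate(&streams[i]);

        /* Operations in different streams may overlap on the device;
           operations within one stream execute in issue order. */
        for (int i = 0; i < NUM_STREAMS; ++i) {
            size_t off = (size_t)i * chunkSize;
            cudaMemcpyAsync(d_data + off, h_data + off,
                            chunkSize * sizeof(float),
                            cudaMemcpyHostToDevice, streams[i]);
            processChunk<<<1, THREADS_PER_BLOCK, 0, streams[i]>>>(d_data + off,
                                                                  chunkSize);
        }

        cudaDeviceSynchronize();                   /* wait for all streams */

        for (int i = 0; i < NUM_STREAMS; ++i)
            cudaStreamDestroy(streams[i]);
        cudaFreeHost(h_data);
        cudaFree(d_data);
        return 0;
    }

Note that this keeps all the work in a single context, so the data does not need to be gathered into one giant structure: each stream copies and processes only its own chunk.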