
I am looking for a way to partition my Nvidia GPU device, so that I can run two sets of kernels concurrently without them fighting for SMs.

According to the documentation, OpenCL lets you do this with clCreateSubDevices. Is there any CUDA equivalent?

1
TTBOMK, CUDA does not support device fission à la OpenCL. You can, however, launch multiple kernels in parallel, and the scheduler may run them simultaneously, depending on resource availability (and scheduler mood). But it is not guaranteed. - user703016

1 Answer


I personally haven't come across such a feature in CUDA.

To run two kernels concurrently, you can calculate the occupancy of your kernels, launch a correspondingly limited number of blocks, and use a loop inside each kernel to cover the work the missing blocks would have done (a grid-stride loop). This will probably cost you a few extra registers per thread. If you don't want to touch the contents of your kernels, you can instead launch each kernel multiple times in its own stream, each time with a limited grid size. The cost of this second approach is that the SMs will probably not be fully occupied while one stream transitions between successive kernel launches.
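A minimal sketch of what I mean, assuming two independent kernels (kernelA/kernelB, the 256-thread blocks, and the 32-block cap are all illustrative values, not something CUDA prescribes): each kernel uses a grid-stride loop so it is correct for any grid size, and each is launched in its own stream with a deliberately small grid so the two grids can coexist on the SMs.

```cuda
#include <cuda_runtime.h>

// Grid-stride loop: each thread processes several elements, so the kernel
// covers the whole problem even when launched with a capped grid.
__global__ void kernelA(float *data, int n)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += gridDim.x * blockDim.x)
        data[i] *= 2.0f;
}

__global__ void kernelB(float *data, int n)
{
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i < n;
         i += gridDim.x * blockDim.x)
        data[i] += 1.0f;
}

void run_concurrently(float *d_a, float *d_b, int n)
{
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Cap each grid (32 blocks here, chosen for illustration; in practice
    // derive it from your occupancy calculation) so that neither kernel
    // requests all SMs and the scheduler has room to run both at once.
    kernelA<<<32, 256, 0, s1>>>(d_a, n);
    kernelB<<<32, 256, 0, s2>>>(d_b, n);

    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
}
```

Note that, as the comment above says, concurrent execution is still at the scheduler's discretion; capping the grids only makes it possible, not guaranteed.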