0 votes

I am running my OpenCL C code on our institution's GPU cluster, which has 8 nodes; each node has an Intel Xeon 8-core processor with 3 NVIDIA Tesla M2070 GPUs (24 GPUs in total). I need a way, from my host code, to identify which of the GPUs are already occupied and which are free, and to submit my jobs to the available GPUs. The closest answers that I could find were:

How to programmatically discover specific GPU on platform with multiple GPUs (OpenCL 1.1)?

How to match OpenCL devices with a specific GPU given PCI vendor, device and bus IDs in a multi-GPU system?

Can anyone help me with how to choose a node, and a GPU on that node which is free, for computation? I am writing in OpenCL C.

Gerald

2
Each node should be able to manage its own GPUs. You can tell at the node level when a GPU is free, and then either send a message to a dispatcher or poll for new work to be done. – mfa

2 Answers

2 votes

Unfortunately, there is no standard way to do such a thing.

If you want to squeeze the full power of the GPUs for computation and your problem is not a memory hog, I suggest using two contexts per device: as the kernels in the first context finish computing, the kernels in the second one are still working, so you have time to refill the first context's buffers with data and start the next task there, and vice versa. In my case (AMD GPU, OpenCL 1.2) this saves from 0 to 20% of the computational time. Three contexts are sometimes slower and sometimes faster, so I do not recommend that as a standard technique, but you can try it. Four or more contexts are useless, in my experience.
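For illustration, a minimal host-side sketch of that ping-pong scheme in C could look like the code below. The kernel name "my_kernel", the source string, the buffer contents and the iteration count are placeholders for your own workload, and error checking and resource cleanup are omitted for brevity.

    #include <CL/cl.h>
    #include <stddef.h>

    #define N_CTX 2   /* two contexts on the same device, used alternately */

    void run_ping_pong(cl_device_id dev, const char *src,
                       const float *host_data, size_t n, int iterations)
    {
        cl_context       ctx[N_CTX];
        cl_command_queue q[N_CTX];
        cl_program       prog[N_CTX];
        cl_kernel        krn[N_CTX];
        cl_mem           buf[N_CTX];
        size_t bytes = n * sizeof(float);

        /* Build an independent context, queue, program and buffer per slot. */
        for (int i = 0; i < N_CTX; ++i) {
            ctx[i]  = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
            q[i]    = clCreateCommandQueue(ctx[i], dev, 0, NULL);
            prog[i] = clCreateProgramWithSource(ctx[i], 1, &src, NULL, NULL);
            clBuildProgram(prog[i], 1, &dev, NULL, NULL, NULL);
            krn[i]  = clCreateKernel(prog[i], "my_kernel", NULL);
            buf[i]  = clCreateBuffer(ctx[i], CL_MEM_READ_WRITE, bytes, NULL, NULL);
            clSetKernelArg(krn[i], 0, sizeof(cl_mem), &buf[i]);
        }

        /* Alternate between the two contexts: while one is computing, the
         * other one is being refilled with data for the next task. */
        for (int iter = 0; iter < iterations; ++iter) {
            int i = iter % N_CTX;
            clEnqueueWriteBuffer(q[i], buf[i], CL_FALSE, 0, bytes, host_data,
                                 0, NULL, NULL);
            clEnqueueNDRangeKernel(q[i], krn[i], 1, NULL, &n, NULL,
                                   0, NULL, NULL);
            clFlush(q[i]);   /* hand the work to the driver without blocking */
        }

        for (int i = 0; i < N_CTX; ++i)
            clFinish(q[i]);
    }

Because each queue is in-order, the write for iteration k+2 on a context automatically waits for that context's kernel from iteration k, while the other context keeps the device busy in between.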

0 votes

Have a command queue for each device, use OpenCL events with each kernel submission, and check their state before submitting a new kernel for execution. Whichever command queue has the fewest unfinished kernels is the one you should enqueue to.
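As a sketch of how that bookkeeping could look in C: the device_slot struct, MAX_INFLIGHT pool and pick_least_busy helper below are illustrative names of my own, not part of the OpenCL API; only clGetEventInfo with CL_EVENT_COMMAND_EXECUTION_STATUS comes from the standard.

    #include <CL/cl.h>

    #define MAX_INFLIGHT 16

    typedef struct {
        cl_command_queue queue;
        cl_event         events[MAX_INFLIGHT];
        int              n_events;
    } device_slot;

    /* Count how many previously enqueued kernels have not yet completed,
     * dropping finished events from the list as we go. */
    static int unfinished_count(device_slot *s)
    {
        int kept = 0;
        for (int i = 0; i < s->n_events; ++i) {
            cl_int status;
            clGetEventInfo(s->events[i], CL_EVENT_COMMAND_EXECUTION_STATUS,
                           sizeof(status), &status, NULL);
            if (status == CL_COMPLETE)
                clReleaseEvent(s->events[i]);      /* done: forget it */
            else
                s->events[kept++] = s->events[i];  /* still queued or running */
        }
        s->n_events = kept;
        return kept;
    }

    /* Return the index of the device whose queue has the fewest unfinished
     * kernels -- that is where the next kernel should be enqueued. */
    int pick_least_busy(device_slot *slots, int n_devices)
    {
        int best = 0, best_count = unfinished_count(&slots[0]);
        for (int d = 1; d < n_devices; ++d) {
            int c = unfinished_count(&slots[d]);
            if (c < best_count) { best = d; best_count = c; }
        }
        return best;
    }

    /* Usage when submitting work (assumes the event pool is not full):
     *
     *   int d = pick_least_busy(slots, n_devices);
     *   cl_event ev;
     *   clEnqueueNDRangeKernel(slots[d].queue, kernel, 1, NULL, &gws, NULL,
     *                          0, NULL, &ev);
     *   slots[d].events[slots[d].n_events++] = ev;
     */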