3 votes

As described here: https://developer.nvidia.com/gpudirect I can access GPU1-RAM from a GPU0 core when both GPUs are on the same PCIe bus, using either:

  • Load/Store
  • cudaMemcpy()

This is called "NVIDIA GPUDirect v2 Peer-to-Peer (P2P) Communication Between GPUs on the Same PCIe Bus (2011)". A minimal sketch of both paths is shown below.
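The following is a minimal sketch of the P2P case, assuming device IDs 0 and 1 and an arbitrary buffer size; it shows the cudaMemcpy() path explicitly and notes where the Load/Store path would come in once peer access is enabled:

    // Minimal sketch: GPUDirect v2 P2P between two GPUs on the same PCIe bus.
    // Device IDs 0 and 1 and the buffer size are illustrative assumptions.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        const size_t bytes = 1 << 20;
        int can01 = 0, can10 = 0;
        cudaDeviceCanAccessPeer(&can01, 0, 1);
        cudaDeviceCanAccessPeer(&can10, 1, 0);
        if (!can01 || !can10) {
            printf("P2P not supported between devices 0 and 1\n");
            return 1;
        }

        float *buf0, *buf1;
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);   // lets GPU0 kernels load/store GPU1 memory
        cudaMalloc(&buf0, bytes);

        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);
        cudaMalloc(&buf1, bytes);

        // cudaMemcpy() path: direct device-to-device copy over the PCIe bus
        cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);

        // Load/Store path: once peer access is enabled, a kernel running on GPU0
        // may also dereference buf1 directly (kernel not shown here).

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }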

And I can use RDMA over InfiniBand to copy from GPU2-RAM to GPU1-RAM when the GPUs are on different PCIe buses; this is called "GPUDirect™ Support for RDMA, Introduced with CUDA 5 (2012)".

But with RDMA over InfiniBand between GPUs on different PCIe buses, can I use both:

  • Load/Store (access from a GPU2 core to GPU1-RAM)
  • cudaMemcpy() (to copy from GPU2-RAM to GPU1-RAM)

Or can I use only cudaMemcpy() with RDMA?


1 Answer

3 votes

GPUDirect RDMA has a single public implementation at this time, which is via Mellanox InfiniBand. You would need to use something like one of the CUDA-aware MPI implementations to take advantage of it.
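As a rough illustration of what "CUDA-aware MPI" means in practice, here is a minimal sketch that passes device pointers directly to MPI_Send/MPI_Recv. It assumes an MPI build with CUDA support (e.g., Open MPI configured with CUDA); whether the transfer actually goes through GPUDirect RDMA depends on the driver, NIC, and MPI stack on the system:

    // Minimal sketch: a CUDA-aware MPI transfer between GPUs in different hosts.
    // Assumes a CUDA-aware MPI build; run with two ranks, e.g. one per node.
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;
        float *dbuf;
        cudaMalloc(&dbuf, n * sizeof(float));

        if (rank == 0) {
            // With a CUDA-aware MPI, a device pointer can be passed directly;
            // the library (and GPUDirect RDMA, when available) handles the transfer.
            MPI_Send(dbuf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(dbuf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        cudaFree(dbuf);
        MPI_Finalize();
        return 0;
    }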

You cannot use cudaMemcpy to copy from GPU1 to GPU2 in the example you have shown, i.e. you cannot use cudaMemcpy to copy directly between GPUs that live in different host systems.