2
votes

This question comes to mind every time I use pinned memory in CUDA. I have searched a lot on this topic but didn't find anything. Basically, we have to perform two data transfers in order to use pinned memory:

Step 1 -> Pageable memory to pinned memory

Step 2 -> Pinned memory to device memory

I could also initialize the pinned memory directly with the input data and transfer it to the GPU, which would save the transfer time of Step 1. In my case, however, I am processing a very large amount of input data on the GPU, and too much page-locked memory can degrade overall system performance, so I cannot simply put the whole allocation in pinned memory. I have to iterate over the transfers of Step 1 and Step 2 (above).

Is there any provision in CUDA to convert existing (pageable) host memory into pinned memory? Something like the following:

Step 1 -> Initialize the pageable memory with the input data

Step 2 -> Convert the above memory to pinned memory

Step 3 -> Transfer to device and perform execution

I hope what I am asking makes sense.


1 Answer

7
votes

Yes, you can.

The runtime API includes cudaHostRegister, which allows an existing pageable memory allocation to be registered with the CUDA context. This can include pinning the memory, mapping it into the virtual address space, or both. Your CUDA context must have been created with the cudaDeviceMapHost flag (which is the default if the context is created in the runtime API), and there are potentially some alignment requirements the memory must satisfy, depending on the driver version and platform you are using. But it can, in principle, be done. When the buffer no longer needs to be pinned, cudaHostUnregister releases the registration.
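
The three steps from the question could be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the CHECK macro and the upload_with_registration function are hypothetical names made up for this example, the buffer size is arbitrary, and it assumes the platform accepts the address returned by std::vector without further alignment work (which, as noted above, may not hold for every driver version).

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): abort if a runtime call fails.
#define CHECK(call)                                                 \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            std::fprintf(stderr, "%s failed: %s\n", #call,          \
                         cudaGetErrorString(err_));                 \
            std::exit(EXIT_FAILURE);                                \
        }                                                           \
    } while (0)

// Hypothetical function: pins an existing pageable buffer in place,
// copies it to the device and back, and reports whether the data survived.
bool upload_with_registration(size_t n) {
    const size_t bytes = n * sizeof(float);

    // Step 1: ordinary pageable allocation, initialized with input data.
    std::vector<float> host(n);
    for (size_t i = 0; i < n; ++i) host[i] = static_cast<float>(i % 1024);

    // Step 2: page-lock the existing allocation (no extra host copy).
    CHECK(cudaHostRegister(host.data(), bytes, cudaHostRegisterDefault));

    // Step 3: transfer to the device; kernels would be launched after this.
    float* dev = nullptr;
    CHECK(cudaMalloc(&dev, bytes));
    CHECK(cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice));

    // Copy back into a separate pageable buffer just to verify the transfer.
    std::vector<float> check(n, -1.0f);
    CHECK(cudaMemcpy(check.data(), dev, bytes, cudaMemcpyDeviceToHost));

    // Unpin the buffer so the pages become pageable again, then clean up.
    CHECK(cudaHostUnregister(host.data()));
    CHECK(cudaFree(dev));

    return check == host;
}

int main() {
    const bool ok = upload_with_registration(1 << 20);
    std::printf("round trip %s\n", ok ? "matched" : "did not match");
    return ok ? 0 : 1;
}
```

For a very large input, the same pattern could presumably be applied to one chunk of the buffer at a time (register, copy, unregister), so that only a bounded amount of memory is page-locked at any moment. Registration itself has a cost, so whether this beats staging through a small reusable pinned buffer is something to measure.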