I have a few questions about the OpenCL memory transfer functions. I have come across many related questions, but none of them received a detailed answer. Perhaps we can collect a complete answer here.
This is my current understanding of the three ways of moving data:
1) enqueueReadBuffer/enqueueWriteBuffer - these two functions always copy data between an array created on the host and a buffer on the device (to the device on write, from the device on read). No pinned memory and no DMA mechanism are used here.
2) enqueueMigrateMemObjects - this is sometimes described as an alternative to enqueueReadBuffer/enqueueWriteBuffer, but here the memory is copied exactly when this function is called. No pinned memory and no DMA mechanism are used here.
3) enqueueMapBuffer/enqueueUnmapMemObject - here pinned memory and the DMA mechanism are always used.
Mapping works with two kinds of buffers: those created with the CL_MEM_USE_HOST_PTR flag and those created with the CL_MEM_ALLOC_HOST_PTR flag. With the first, an array created on the host is mapped to a buffer created on the device. With the second, a buffer is allocated on the device and mapped to a newly created array on the host.
This is what I can gather from the documentation. I ran several tests, but the only thing I observed was that the migration function is faster than reading/writing. I have some questions about each of these three points:
1) If these functions only copy, then why do people at https://software.intel.com/en-us/forums/opencl/topic/509406 talk about pinning/unpinning memory during reading/writing? Under which conditions do these functions use pinned memory? Or is this just a feature of the Intel implementation, where ALL memory-transfer functions use pinned memory and DMA?
Also, does this mean that if I use pinned memory, the DMA mechanism will be used? And vice versa: if I want DMA to work, do I need pinned memory?
2) Does the migration function do exactly what happens inside enqueueReadBuffer/enqueueWriteBuffer, just without the additional overhead those functions add? Does it always just copy, or does it also do a DMA transfer?
For some reason, some sources, when talking about DMA transfers, use the words "copy", "memory" and "migration" for transferring data between two buffers (one on the host and one on the device). However, as I understand it, there should be no copy with DMA: we just write directly to the buffer. How should I think of this write during DMA?
What happens if I use enqueueMigrateMemObjects with buffers created with the CL_MEM_USE_HOST_PTR flag?
3) With these two functions (map/unmap) I am completely confused. How do the mapping and the reading/writing happen if I use a) an existing host pointer or b) a newly allocated host pointer?
Also, I do not properly understand how DMA works here. If I have mapped my buffer on the host side to the buffer on the device side, which functions actually transfer the memory between them? Should I always unmap my buffer afterwards?
There is no explanation of this anywhere along the lines of: "When we create a new buffer with this flag and use this memory transfer function, the data is transferred this way and such features as... are used. If the memory is created as read-only, this happens; if the memory is write-only, that happens".
Maybe there is already a good guide for this, but I cannot answer my questions from the OpenCL specification alone.