I have several questions that I need clarified. Sorry if they seem very basic.
When we launch a kernel using clEnqueueNDRangeKernel, what actually happens in the host code? Does it wait for the kernel to complete, or not?
Say we have multiple kernels. What happens in that case? If one of the kernels has completed, can the host retrieve the results from that kernel while the others are still computing?
I was reading the OpenCL spec for clCreateBuffer (link here), in particular the description of the flag CL_MEM_USE_HOST_PTR. For convenience, I have quoted it here: "it indicates that the application wants the OpenCL implementation to use memory referenced by host_ptr as the storage bits for the memory object."
I do not understand what exactly they mean by "application" and "OpenCL implementation". It also says that "OpenCL implementations are allowed to cache the buffer contents."