Does CU_CTX_SCHED_BLOCKING_SYNC make kernels synchronous?

Question

Does creating a CUDA context with CU_CTX_SCHED_BLOCKING_SYNC make CUDA kernel launches actually synchronous (i.e. stalling the CPU thread until the kernel finishes, as an ordinary function call on the same thread would)?

The documentation only states:

CU_CTX_SCHED_BLOCKING_SYNC: Instruct CUDA to block the CPU thread on a synchronization primitive when waiting for the GPU to finish work.

but I'm not sure I understood it right.

talonmies · Accepted Answer · 2018-05-24T07:32:54

No.

These flags control how the host thread behaves when a host<->device synchronization API such as cuCtxSynchronize, cuEventSynchronize, or cuStreamSynchronize is called. Other, non-blocking API calls, kernel launches included, remain asynchronous regardless of which flag is set.

The flags select how the host thread waits inside such a synchronization call. With CU_CTX_SCHED_SPIN it actively spins until the GPU is done, with CU_CTX_SCHED_YIELD it yields to other host threads while waiting, and with CU_CTX_SCHED_BLOCKING_SYNC it sleeps on a synchronization primitive until the GPU finishes.
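To check that a launch stays asynchronous under the blocking-sync setting, one can time the launch call and the following synchronization separately. The sketch below is not from the original answer. It assumes the runtime API, where cudaDeviceScheduleBlockingSync is the counterpart of CU_CTX_SCHED_BLOCKING_SYNC, and the names spin_kernel, Timings and time_launch_and_sync are made up for illustration.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Busy-wait on the device for roughly `cycles` clock ticks so the kernel
// runs long enough to be measured from the host.
__global__ void spin_kernel(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper type: host-side wall-clock time spent in each call.
struct Timings {
    double launch_ms;  // time inside the kernel launch call
    double sync_ms;    // time inside cudaDeviceSynchronize
};

// Returns false if any CUDA call fails.
bool time_launch_and_sync(Timings *out)
{
    using host_clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    // Request blocking-sync scheduling before the context is created.
    if (cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync) != cudaSuccess) return false;
    if (cudaFree(0) != cudaSuccess) return false;  // forces context creation

    // Warm-up launch so one-time setup cost is not counted below.
    spin_kernel<<<1, 1>>>(0);
    if (cudaDeviceSynchronize() != cudaSuccess) return false;

    auto t0 = host_clock::now();
    spin_kernel<<<1, 1>>>(200000000LL);     // cycle count is an arbitrary choice
    auto t1 = host_clock::now();            // launch call has returned
    cudaError_t err = cudaDeviceSynchronize();
    auto t2 = host_clock::now();            // GPU work has finished
    if (err != cudaSuccess) return false;

    out->launch_ms = ms(t1 - t0).count();
    out->sync_ms = ms(t2 - t1).count();
    return true;
}

int main()
{
    Timings t{};
    if (!time_launch_and_sync(&t)) {
        std::fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    std::printf("launch: %.3f ms, synchronize: %.3f ms\n", t.launch_ms, t.sync_ms);
    return 0;
}
```

Compiled with nvcc, this should report a launch time far smaller than the synchronize time, since the host only waits inside cudaDeviceSynchronize. Running the same binary with CUDA_LAUNCH_BLOCKING=1 in the environment should move most of the wait into the launch call instead.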

If you want to enforce blocking behaviour on kernel launch, use the CUDA_LAUNCH_BLOCKING environment variable.
