Concurrent writes in the same global memory location

Question

I have several blocks, each having some integers in a shared memory array of size 512. How can I check if the array in every block contains a zero as an element?

What I am doing is creating an array in global memory. Its size depends on the number of blocks, and it is initialized to 0. Each block then writes a[blockid] = 1 if its shared memory array contains a zero.

My problem arises when several threads in a single block write at the same time. That is, if the array in shared memory contains more than one zero, then several threads will write a[blockid] = 1. Would this cause any problems?

In other words, would it be a problem if 2 threads write the exact same value to the exact same array element in global memory?

Tom Tom · Accepted Answer · 2012-03-06T16:29:38

For a CUDA program, if multiple threads in a warp write to the same location, then the location will be updated, but it is undefined how many times it is updated (i.e. how many actual writes occur in series) and which thread writes last (i.e. which thread wins the race).

For devices of compute capability 2.x, if multiple threads in a warp write to the same address, then only one thread will actually perform the write; which thread does so is undefined.

From the CUDA C Programming Guide section F.4.2:

If a non-atomic instruction executed by a warp writes to the same location in global memory for more than one of the threads of the warp, only one thread performs a write and which thread does it is undefined.

See also section 4.1 of the guide for more info.

In other words, if all threads writing to a given location write the same value, then it is safe.
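As a minimal sketch of the pattern from the question, assuming one thread per array element and a block size of 512: every thread that sees a zero stores the same value, 1, into its block's flag, so it does not matter which thread wins the race. The names flag_blocks_with_zero, find_zero_blocks, data and flags are hypothetical and chosen only for illustration; error handling is kept to a minimum.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

constexpr int kBlockSize = 512;  // shared array size taken from the question

// Hypothetical kernel: each block loads its 512-integer slice into shared
// memory, and every thread whose element is zero writes 1 to the block's flag.
// All racing writers store the same value, so the outcome does not depend on
// which write lands last.
__global__ void flag_blocks_with_zero(const int* data, int* flags)
{
    __shared__ int tile[kBlockSize];
    const int tid = threadIdx.x;

    // Each thread only reads back the element it wrote itself, so no
    // __syncthreads() is needed in this particular sketch.
    tile[tid] = data[blockIdx.x * kBlockSize + tid];

    if (tile[tid] == 0) {
        flags[blockIdx.x] = 1;  // possibly many threads, always the same value
    }
}

// Hypothetical host helper: host_data holds num_blocks * 512 integers and
// host_flags receives one 0/1 flag per block. Returns 0 on success, -1 on error.
int find_zero_blocks(const int* host_data, int num_blocks, int* host_flags)
{
    int* d_data = nullptr;
    int* d_flags = nullptr;
    const size_t data_bytes = static_cast<size_t>(num_blocks) * kBlockSize * sizeof(int);
    const size_t flag_bytes = static_cast<size_t>(num_blocks) * sizeof(int);

    if (cudaMalloc(&d_data, data_bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_flags, flag_bytes) != cudaSuccess) {
        cudaFree(d_data);
        return -1;
    }

    cudaMemcpy(d_data, host_data, data_bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_flags, 0, flag_bytes);  // flags start at 0, as in the question

    flag_blocks_with_zero<<<num_blocks, kBlockSize>>>(d_data, d_flags);

    // This blocking copy runs after the kernel has finished, so the host
    // reads the final flag values.
    const cudaError_t err =
        cudaMemcpy(host_flags, d_flags, flag_bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_flags);
    return err == cudaSuccess ? 0 : -1;
}
```

Calling find_zero_blocks with two blocks of data, where only the second block contains zeros, should leave flags[0] at 0 and set flags[1] to 1, regardless of how many zeros the second block holds. If different threads needed to write different values, or to count the zeros, this plain store would no longer be enough and an atomic operation such as atomicAdd would be the usual choice.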
