I have code from a CUDA example that uses atomicAdd to increment a single variable:

__global__ void myadd(int *data)
{
  unsigned int x = blockIdx.x;
  unsigned int y = threadIdx.x;
  if ( (x%2==0) && (y%2==1) ) atomicAdd(data,1);
}

int main(void)
{
  int *Hdt;
  Hdt = (int*)malloc(sizeof(int));
  // ... CUDA initialization here
  myadd<<<20, 10>>>(Hdt);
  cudaFree(Hdt);
}

It works well for me. Now I am expanding my code, and I would like to pass an array to the kernel instead of a single number:

__global__ void myadd(int *data)
{
  unsigned int x = blockIdx.x;
  unsigned int y = threadIdx.x;
  unsigned int z = threadIdx.y;
  if ( (x%2==0) && (y%2==1) && (z>4) ) atomicAdd(data[z],1);
}

int main(void)
{
  int *Hdt;
  Hdt = (int*)malloc(20*sizeof(int));
  // ... CUDA initialization here
  myadd<<<20, dim3(10, 20)>>>(Hdt);
  cudaFree(Hdt);
}

But it doesn't compile. The error message is:

error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (int, int)

This question has nothing to do with CUDA or atomicAdd. It's about understanding how basic arrays and functions work in C++. - Kerrek SB
In my understanding atomicAdd is not needed here since each thread accesses a different memory location (the z in data[z]) - there is no chance of them simultaneously writing to the same memory location. - ejectamenta

1 Answer

Replace:

atomicAdd(data[z],1);

with

atomicAdd(&data[z],1);

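If you look carefully, in the first case you were passing a pointer (data) as the first argument to atomicAdd(). In the second case, data[z] is an int value rather than an address, which is why the compiler reports the argument types as (int, int). atomicAdd() needs the address of the element to update, so pass &data[z] (or, equivalently, data + z).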
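For reference, here is a minimal sketch of a complete program with the fix applied. The question elides its CUDA initialization, so the allocation and copy steps below are an assumption about what that setup might look like, not the asker's actual code; the names dev, host and N are hypothetical. The sketch also assumes the array has to live in device memory (cudaMalloc) rather than come from malloc, since the kernel dereferences the pointer on the GPU.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same kernel as in the question, with the address-of fix applied.
__global__ void myadd(int *data)
{
  unsigned int x = blockIdx.x;
  unsigned int y = threadIdx.x;
  unsigned int z = threadIdx.y;
  // atomicAdd expects an address (int*), so pass &data[z], not the value data[z].
  if ( (x%2==0) && (y%2==1) && (z>4) ) atomicAdd(&data[z], 1);
}

int main(void)
{
  const int N = 20;   // hypothetical: one counter per threadIdx.y value
  int host[N];        // hypothetical host-side copy of the results
  int *dev = NULL;    // hypothetical device pointer

  // Assumed setup: allocate device memory and zero the counters.
  if (cudaMalloc((void**)&dev, N * sizeof(int)) != cudaSuccess) return 1;
  if (cudaMemset(dev, 0, N * sizeof(int)) != cudaSuccess) return 1;

  // Same launch configuration as in the question: 20 blocks of 10 x 20 threads.
  myadd<<<20, dim3(10, 20)>>>(dev);
  if (cudaGetLastError() != cudaSuccess) return 1;

  // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
  if (cudaMemcpy(host, dev, N * sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) return 1;
  cudaFree(dev);

  for (int z = 0; z < N; ++z) printf("data[%d] = %d\n", z, host[z]);
  return 0;
}
```

With this launch configuration, 10 even-numbered blocks and 5 odd threadIdx.x values pass the test for each z > 4, so 50 threads increment the same data[z]. Entries 5 to 19 should therefore come out as 50 and entries 0 to 4 as 0. Because many threads hit the same element, the atomic update is doing real work here.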