I'm working in a GPU Kernel and I have some problems copying data from global to local memory here is my kernel function:
__kernel void nQueens( __global int * data, __global int * result, int board_size)
so I want to copy from __global int * data to __local int aux_data[OBJ_SIZE] I tried to copy like a normal array:
for(int i = 0; i < OBJ_SIZE; ++i)
{
aux_data[stack_size*OBJ_SIZE + i] = data[index*OBJ_SIZE + i];
}
and also with the functions to copy:
event_t e = async_work_group_copy ( aux_data, (data + (index*OBJ_SIZE)), OBJ_SIZE, 0);
wait_group_events (1, e);
And in both situations I get different values between the global and local memory. I don't know what I'm doing wrong...