I am working with an array of structures. Each CUDA block uses the data of exactly one structure and does a lot of computation on it. In order for the program to work, I would like to load that structure into shared memory.
I have tried to use the memcpy function like this:
struct LABEL_2D {
    int a;
    float *b[MAX];
};

__shared__ struct LABEL_2D self_label;
if (threadIdx.x == 0) {
    memcpy(&self_label,
           label + (blockIdx.x * sizeof(struct LABEL_2D)),
           sizeof(struct LABEL_2D));
}
__syncthreads();
But at execution I get the following error from cudaGetLastError(): unspecified launch failure
I am wondering whether it is possible to load a structure into shared memory.
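For reference, here is a minimal sketch of one way a structure can be copied into shared memory. It relies on assumptions not stated above: that label is declared as a struct LABEL_2D * pointing to device memory, and that MAX is 4. Under that assumption, label + blockIdx.x already advances by whole structures, so multiplying the offset by sizeof(struct LABEL_2D) again would read far past the end of the array, which is one plausible cause of an unspecified launch failure. The kernel name load_label, the out array, and the host helper run_demo are hypothetical names used only for illustration.

```cuda
#include <cuda_runtime.h>

#define MAX 4  // assumed value; the original post does not give MAX

struct LABEL_2D {
    int a;
    float *b[MAX];
};

// Each block copies its own structure into shared memory.
// Assumption: label is a typed device pointer to an array of LABEL_2D.
__global__ void load_label(const LABEL_2D *label, int *out)
{
    __shared__ LABEL_2D self_label;

    if (threadIdx.x == 0) {
        // Indexing a typed pointer already steps in units of sizeof(LABEL_2D),
        // so no extra multiplication by sizeof is needed.
        self_label = label[blockIdx.x];
    }
    __syncthreads();  // make the copy visible to every thread in the block

    // Write one field back so the host can check what each block loaded.
    if (threadIdx.x == 0) {
        out[blockIdx.x] = self_label.a;
    }
}

// Hypothetical host helper: returns 0 if every block read its own structure,
// -1 on a CUDA error, 1 on a data mismatch.
int run_demo()
{
    const int n = 8;  // number of structures, one block per structure
    LABEL_2D h_label[n];
    for (int i = 0; i < n; ++i) {
        h_label[i].a = 10 * i;
        for (int j = 0; j < MAX; ++j) {
            h_label[i].b[j] = nullptr;  // pointers are not dereferenced here
        }
    }

    LABEL_2D *d_label = nullptr;
    int *d_out = nullptr;
    int h_out[n];
    int status = 0;

    if (cudaMalloc(&d_label, sizeof(h_label)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_label);
        return -1;
    }

    cudaMemcpy(d_label, h_label, sizeof(h_label), cudaMemcpyHostToDevice);
    load_label<<<n, 32>>>(d_label, d_out);
    if (cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost) != cudaSuccess) {
        status = -1;  // also reports a failed kernel launch or execution
    }

    for (int i = 0; status == 0 && i < n; ++i) {
        if (h_out[i] != h_label[i].a) status = 1;
    }

    cudaFree(d_out);
    cudaFree(d_label);
    return status;
}
```

Note that only the structure itself ends up in shared memory. The pointers stored in b are copied as values, so whatever they point to stays where it was and must itself be device-accessible memory if the kernel dereferences them. If label were instead a char * (byte pointer), the byte offset in the original code would be correct and the failure would have to come from somewhere else, for example a host pointer passed to the kernel.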