My code fails with a kernel launch failure (device illegal address) when I run it under cuda-gdb for one particular input. Running it under cuda-memcheck reports an "Invalid write of size 4" error. The code is too large to post in full, so I will describe the scenario here.
I have a main kernel to which I pass an array pointer that serves as a stack. A device function, called from the main kernel, uses this stack:
    __device__ void find(int v, int* p, int* pv, int n, int* d_stackContents)
    {
        int d_stackTop;
        d_stackTop = -1;                    // -1 means the stack is empty
        *pv = p[v];
        if (*pv == -1) {
            *pv = v;
        }
        else {
            cuPrintf("Stack top is %d\n", d_stackTop);
            d_stackTop = d_stackTop + 1;
            d_stackContents[d_stackTop] = v;
            cuPrintf("Stack top is %d\n", d_stackTop);
            // Follow p[] until -1 is reached, pushing each visited index.
            while (*pv != -1) {
                d_stackTop = d_stackTop + 1;
                d_stackContents[d_stackTop] = *pv;  // invalid write is reported here
                cuPrintf("Stack top is %d\n", d_stackTop);
                *pv = p[*pv];
            }
        }
    }
The error occurs at the line d_stackContents[d_stackTop] = *pv; inside the while loop.
I call the device function from the main kernel as follows:
    find(v[idx], p, &pv, n, d_stackContents);
where idx = threadIdx.x + blockDim.x * blockIdx.x, and pv is declared in the kernel as int pv;
The d_stackContents array is allocated in main using cudaMalloc and passed as an argument to the main kernel.
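One way to narrow this down while debugging might be to guard every push against the size of the allocation. The following is only a minimal sketch, not the original code: find_checked and the stackCapacity parameter are hypothetical names, and it assumes stackCapacity is the number of ints that were allocated for d_stackContents. The unused n parameter is left out, and the macro guard is there only so the same function can also be compiled and checked on the host.

```cpp
// Lets the same sketch compile as plain C++ for host-side checking.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical bounds-checked variant of find(). stackCapacity is an assumed
// extra parameter: the number of ints available in d_stackContents.
// Returns the number of entries pushed, or -1 if a push would overflow.
__host__ __device__ int find_checked(int v, const int* p, int* pv,
                                     int* d_stackContents, int stackCapacity)
{
    int d_stackTop = -1;                 // -1 means the stack is empty
    *pv = p[v];
    if (*pv == -1) {
        *pv = v;                         // nothing is pushed in this case
        return 0;
    }
    if (d_stackTop + 1 >= stackCapacity) return -1;
    d_stackContents[++d_stackTop] = v;
    while (*pv != -1) {
        // Stop instead of writing past the end of the allocation.
        if (d_stackTop + 1 >= stackCapacity) return -1;
        d_stackContents[++d_stackTop] = *pv;
        *pv = p[*pv];
    }
    return d_stackTop + 1;
}
```

If this returns -1 for the failing input, the chain followed through p is longer than the allocated stack, or p contains a cycle so the loop never reaches -1. It may also be worth checking whether each thread needs its own region of d_stackContents, since the call shown above passes the same pointer to every thread.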