CUDA device code largely follows C and C++ syntax, so you can print the numerical value of a pointer in kernel code with printf, just as you would on the host:
printf("pval = %p\n", my_pointer);
If you wanted to do this across threads in a CUDA kernel, you could do:
__global__ void my_kernel(int *data){
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    printf("thread: %d, pointer: %p, value: %d\n", idx, &(data[idx]), data[idx]);
}
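or similar. This produces a large amount of output if you launch many threads. Also be aware that in-kernel printf writes into a fixed-size circular buffer: if a kernel produces more output than fits, older output is overwritten, and nothing appears on the host until a synchronization point such as cudaDeviceSynchronize(). The buffer size can be changed with cudaDeviceSetLimit() using cudaLimitPrintfFifoSize.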
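For completeness, here is a minimal sketch of a full program built around the kernel above. The names print_pointers and launch_pointer_print, the extra size parameter n, the bounds check, the array contents, and the 4 MB buffer size are all assumptions added for illustration and are not part of the original snippet. It enlarges the printf buffer before the first launch and synchronizes afterwards so the output actually gets flushed:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Variant of the kernel above with an added size parameter (an assumption),
// so threads past the end of the array do not read out of bounds.
__global__ void print_pointers(const int *data, int n) {
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    if (idx < n) {
        printf("thread: %d, pointer: %p, value: %d\n",
               idx, (const void *)&data[idx], data[idx]);
    }
}

// Hypothetical helper: copies n sample values to the device, launches the
// kernel, and waits for it so the printf output is flushed to the host.
cudaError_t launch_pointer_print(int n) {
    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) host[i] = 10 * i;  // arbitrary sample data

    int *d_data = nullptr;
    cudaError_t err = cudaMalloc(&d_data, n * sizeof(int));
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(d_data, host.data(), n * sizeof(int),
                     cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 128;
        const int blocks = (n + threads - 1) / threads;
        print_pointers<<<blocks, threads>>>(d_data, n);
        err = cudaGetLastError();            // catches launch errors
    }
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();       // flushes the printf buffer
    }
    cudaFree(d_data);
    return err;
}

int main() {
    // Enlarge the in-kernel printf buffer before any printf kernel runs.
    // 4 MB is an arbitrary illustrative value.
    cudaError_t err = cudaDeviceSetLimit(cudaLimitPrintfFifoSize,
                                         4 * 1024 * 1024);
    if (err == cudaSuccess) err = launch_pointer_print(8);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compile with something like nvcc -o ptrprint ptrprint.cu. The printed pointer values are device addresses and will differ from run to run, and the order of lines from different warps or blocks is not guaranteed.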