I would appreciate some help with CUDA device memory pointers. I want to split my CUDA kernel code across multiple files, both for readability and because the program is large. To do that, I need to pass the same device memory pointer to several CUDA kernels, one after another rather than simultaneously. Below is a rough example of what I need:
//random.h
class random{
public:
int* dev_pointer_numbers;
};
So the object simply needs to store the pointer to device memory.
//random_kernel.cu
__global__ void doSomething(int *values){
//do some processing
}
extern "C" void init_memory(int *devPtr,int *host_memory,int arraysize)
{
cudaMalloc(&devPtr,arraysize*sizeof(int));
cudaMemcpy(devPtr,host_memory,arraysize*sizeof(int),cudaMemcpyHostToDevice);
}
extern "C" void runKernel(int *devPtr){
doSomething<<<1,1>>>(devPtr);
}
And the main file:
//main.cpp
//ignoring all the details etc
random rnd;
void CUDA(int *hostArray)
{
init_memory(rnd.dev_pointer_numbers,hostArray,10);
runKernel(rnd.dev_pointer_numbers);
}
My understanding is that when I run the kernel with the object's pointer, the pointer isn't mapped in device memory, and that's why the kernel fails. What I want to know is: how can I store the pointer to a particular block of device memory in my main file so that it can be reused across other CUDA kernel files?
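One thing I have not ruled out is that the failure has nothing to do with mapping, and that init_memory simply receives the pointer by value, so the address written by cudaMalloc never reaches rnd.dev_pointer_numbers. Below is a minimal host-only sketch of that suspicion. It assumes std::malloc as a stand-in for cudaMalloc so that it compiles without the CUDA toolkit, and RandomState, init_by_value and init_by_address are hypothetical names, not part of my real code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the class in random.h
// (renamed so it cannot clash with random() from <cstdlib>).
struct RandomState {
    int *dev_pointer_numbers = nullptr;
};

// Mirrors the shape of my init_memory: the pointer is passed by value,
// so the new address is stored only in the local copy `ptr`.
void init_by_value(int *ptr, const int *host, int n) {
    // Assumption: std::malloc stands in for cudaMalloc here.
    ptr = static_cast<int *>(std::malloc(n * sizeof(int)));
    std::memcpy(ptr, host, n * sizeof(int));
    std::free(ptr);  // freed only so the sketch does not leak; the caller never sees this block
}

// Variant that takes the address of the pointer, so the allocation
// is written back into the caller's object.
void init_by_address(int **ptr, const int *host, int n) {
    assert(ptr != nullptr);
    *ptr = static_cast<int *>(std::malloc(n * sizeof(int)));
    std::memcpy(*ptr, host, n * sizeof(int));
}
```

If this is indeed the cause, I assume the CUDA version would take `int **devPtr`, call `cudaMalloc((void **)devPtr, ...)`, and be invoked as `init_memory(&rnd.dev_pointer_numbers, hostArray, 10)`. The pointer stored in the object could then be handed to kernels defined in any other .cu file.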