
I have the following CUDA code (it's not the full code). I'm trying to check whether it copies the arrays correctly from host to device and back from device to host.

flVector is initialized with a few numbers, and so is indeces.

The pass function needs to copy flVector and indeces to device memory. In main, after calling pass, I copy the arrays back from device to host and print the values to check that they are correct.

flat_h comes back with the correct values, but the copy of indeces (inde_h) comes back with garbage values, and I don't know what the problem with the code is.

To get two values out of the pass function, I return flOnDevice with the return statement and also pass a pointer, inOnDevice, to receive the second array. Both of these arrays live on the device side, and I then try to copy them back to the host. This is just a check to see that everything is working properly, but when I print the data copied back from inOnDevice I get garbage values. Why?

 int* pass(vector<int>& flVector, int* indeces, int inSize, int*   inOnDevice)
 {
   int* flOnDevice;

   cudaMalloc((void**) &(flOnDevice), sizeof(int) * flVector.size());

   cudaMemcpy(flOnDevice, &flVector[0], flVector.size()*sizeof(int),cudaMemcpyHostToDevice);

   cudaMalloc((void**) &(inOnDevice), sizeof(int) * inSize);

   cudaMemcpy(inOnDevice, indeces, inSize*sizeof(int), cudaMemcpyHostToDevice);
   return flOnDevice;
}

void main()
{
    int* inOnDevice = NULL;
    int* flOnDevice;

    flOnDevice = pass(flVector, indeces, indSize, inOnDevice);

    int* flat_h = (int*)malloc(flVector.size()*sizeof(int));
    int* inde_h = (int*)malloc(inSize*sizeof(int));


    cudaMemcpy(flat_h,flOnDevice,flVector.size()*sizeof(int),cudaMemcpyDeviceToHost);
    cudaMemcpy(inde_h,inOnDevice,inSize*sizeof(int),cudaMemcpyDeviceToHost);

    printf("flat_h: \n\n");
    for (int i =0; i < flVector.size(); i++)
        printf("%d, " , flat_h[i]);
    printf("\n\ninde_h: \n\n");
    for (int i =0; i < inSize; i++)
        printf("%d, " , inde_h[i]);
    printf("\n\n");
}
There must be a mistake in the part of the code that you are not showing here. Please give a complete reproducer; otherwise I don't think anyone will be able to help you. - Sagar Masuti

1 Answer


This is not doing what you think it is:

int* pass(vector<int>& flVector, int* indeces, int inSize, int*   inOnDevice)
{
...
  cudaMalloc((void**) &(inOnDevice), sizeof(int) * inSize);

When you pass a pointer to a function this way, you are passing the pointer by value.

If you then take the address of that pointer-passed-by-value inside the function, that address has no connection to anything in the calling context. Inside the function pass, inOnDevice is a local copy of the caller's pointer, and the subsequent cudaMalloc operation modifies only that local copy. The pointer in main is left unchanged (still NULL), so the later device-to-host copy does not read from the allocation made inside pass.

Instead, you need to pass a pointer-to-a-pointer in this situation (simulated pass-by-reference) or else pass by reference. For the pointer-to-a-pointer example, it would look something like this:

int* pass(vector<int>& flVector, int* indeces, int inSize, int**   inOnDevice)
{
...
  cudaMalloc((void**) inOnDevice, sizeof(int) * inSize);

  cudaMemcpy(*inOnDevice, indeces, inSize*sizeof(int), cudaMemcpyHostToDevice);

And in main:

flOnDevice = pass(flVector, indeces, indSize, &inOnDevice);
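Putting the pieces together, a minimal sketch of the corrected round trip might look like the following. The sample data and the helper name roundTripMatches are assumptions made for illustration and are not part of the original code; error checking is left out here to keep the focus on the pointer-to-a-pointer change.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Copies flVector and indeces to the device. The device copy of flVector is
// returned; the device copy of indeces is handed back through inOnDevice,
// which holds the address of the caller's pointer.
int* pass(const std::vector<int>& flVector, const int* indeces, int inSize,
          int** inOnDevice)
{
    int* flOnDevice = nullptr;
    cudaMalloc((void**)&flOnDevice, sizeof(int) * flVector.size());
    cudaMemcpy(flOnDevice, flVector.data(), sizeof(int) * flVector.size(),
               cudaMemcpyHostToDevice);

    // cudaMalloc writes the device address into the caller's variable,
    // because inOnDevice points at that variable rather than at a copy.
    cudaMalloc((void**)inOnDevice, sizeof(int) * inSize);
    cudaMemcpy(*inOnDevice, indeces, sizeof(int) * inSize,
               cudaMemcpyHostToDevice);
    return flOnDevice;
}

// Hypothetical helper: sends sample data through pass(), copies both arrays
// back to the host, and reports whether they match the originals.
bool roundTripMatches()
{
    std::vector<int> flVector = {10, 20, 30, 40};  // assumed sample data
    int indeces[] = {0, 2, 4};                     // assumed sample data
    const int inSize = 3;

    int* inOnDevice = nullptr;
    int* flOnDevice = pass(flVector, indeces, inSize, &inOnDevice);

    std::vector<int> flat_h(flVector.size());
    std::vector<int> inde_h(inSize);
    cudaMemcpy(flat_h.data(), flOnDevice, sizeof(int) * flat_h.size(),
               cudaMemcpyDeviceToHost);
    cudaMemcpy(inde_h.data(), inOnDevice, sizeof(int) * inSize,
               cudaMemcpyDeviceToHost);

    cudaFree(flOnDevice);
    cudaFree(inOnDevice);

    bool ok = (flat_h == flVector);
    for (int i = 0; i < inSize; ++i)
        ok = ok && (inde_h[i] == indeces[i]);
    return ok;
}
```

Calling roundTripMatches() from main on a machine with a working CUDA device should report that both arrays survive the round trip. Passing the pointer by reference (int*& inOnDevice) would work equally well and would leave the call site unchanged.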

And I think that if you had used proper CUDA error checking, as I suggested to you before, you would have seen an error returned from this line of code:

cudaMemcpy(inde_h,inOnDevice,inSize*sizeof(int),cudaMemcpyDeviceToHost);
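One common way to do that kind of checking is to wrap every runtime call so that a failure is reported with its location. The sketch below is only one possible shape: cudaOk, CUDA_OK, and copyBack are hypothetical names introduced here, while cudaError_t, cudaSuccess, and cudaGetErrorString come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a message if a CUDA runtime call failed and
// returns true on success, false on failure.
inline bool cudaOk(cudaError_t err, const char* what, const char* file, int line)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error \"%s\" from %s at %s:%d\n",
                cudaGetErrorString(err), what, file, line);
        return false;
    }
    return true;
}

// Hypothetical macro: records the call text and source location automatically.
#define CUDA_OK(call) cudaOk((call), #call, __FILE__, __LINE__)

// Copies n ints from device memory to host memory and reports whether the
// copy succeeded, instead of silently leaving dst unmodified on failure.
bool copyBack(int* dst, const int* src, int n)
{
    return CUDA_OK(cudaMemcpy(dst, src, n * sizeof(int), cudaMemcpyDeviceToHost));
}
```

With a wrapper like this around the copy into inde_h, a failing cudaMemcpy would print a diagnostic and return false rather than leaving uninitialized host memory to be printed as if it were valid data.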