
I have GPU code that works fine when the kernels are launched with the funcname<<< >>> syntax.

I changed the code to load the kernels from PTX files instead.

Now every call to cudaMemcpyToSymbol returns the error code: invalid symbol

Here is a snippet from the .cu file:

{

__device__  __constant__  void *devInFramePtrs      [20];
__device__  __constant__  void *devOutFramePtrs     [20];
__device__  __constant__  void *devProcFramePtrs    [60];
__device__  __constant__  void *devProcOutFramePtrs [60];
__device__  __constant__ AlgorithmParms *devAlgoParmsPtr;
__device__  __constant__ AlgorithmStats *devStatParmsPtr;           
__device__   float diamondOffsetsGlobal[36];

}

================= In the PTX file I get:

.global .align 4 .b8 devInFramePtrs[80];
.global .align 4 .b8 devOutFramePtrs[80];
.global .align 4 .b8 devProcFramePtrs[240];
.global .align 4 .b8 devProcOutFramePtrs[240];
.global .align 4 .u32 devAlgoParmsPtr;
.global .align 4 .u32 devStatParmsPtr;
.global .align 4 .b8 diamondOffsetsGlobal[144];

================= Then the host code is:

err = cudaMemcpyToSymbol("devInFramePtrs",  gDevInFramePtrs, sizeof(void *) * 20, 0, cudaMemcpyHostToDevice);
err = cudaMemcpyToSymbol("devOutFramePtrs", gDevOutFramePtrs, sizeof(void *) * 20, 0, cudaMemcpyHostToDevice);
err = cudaMemcpyToSymbol("devProcFramePtrs", gDevProcFramePtrs, sizeof(FRAME_BASE_TYPE *) * numDevInProcFramePtrs3, 0, cudaMemcpyHostToDevice);
err = cudaMemcpyToSymbol("devProcOutFramePtrs", gDevProcOutFramePtrs, sizeof(FRAME_BASE_TYPE *) * numDevOutProcFramePtrs3, 0, cudaMemcpyHostToDevice);
err = cudaMemcpyToSymbol("diamondOffsetsGlobal", &(diamondOffset[0][0]), sizeof(float) * 36, 0, cudaMemcpyHostToDevice);

========================

All the calls return value 11: invalid symbol

Details: CUDA 4.2, running under VS2010 on Win7, built as a 32-bit application.

=========================

Here is the compile script:

"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin"
-I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include" -G --keep-dir "Debug" -maxrregcount=32 --machine 32 -ptx -o "U:\filterKernel.ptx" "U:\filterKernel.cu"

As I said, the only change was to generate the PTX file and change the function calls. Also note that I get the error whether the variables are in constant memory or regular global memory.

Thanks in advance.

Are you calling cudaMemcpyToSymbol after loading the PTX with cuModuleLoadData[Ex]? Also note that using strings to reference symbols (you need to pass the symbol itself) is deprecated after CUDA 4.1, so you probably shouldn't be doing it this way anyway. In your case, I believe you need to use cuModuleGetGlobal to get a device pointer that you can copy to. – harrism
I am not sure you can do this with the runtime API. You probably need to use the driver API to fetch the symbol directly from a module after the PTX has been loaded or JIT'd into the context. – talonmies

1 Answer


harrism got it right. Referencing the symbol by a string name in cudaMemcpyToSymbol is deprecated. The correct way is to call cuModuleGetGlobal to obtain the variable's address on the device and then use cudaMemcpy with that address.

Thanks, harrism.