I've compiled a program with some simple OpenACC directives. It compiles without errors. However, when I run the program, I get a generic "call to cuStreamSynchronize returned error 700: Illegal address during kernel execution" error.

I ran cuda-memcheck and got the following errors. Can anyone help me identify the issue?

========= CUDA-MEMCHECK
simpleGridingRatio: 300
========= Invalid __global__ read of size 4
=========     at 0x000007a8 in /home/forwardSolver/ChannelCppSolver.h:135:void linearDiscretization_135_gpu<double>(caseProp<double>&, std::vector<double, std::allocator<double>>&, std::vector<double, std::allocator<double>>&, std::vector<double, std::allocator<double>>&, std::vector<double, std::allocator<double>>&)
=========     by thread (0,0,0) in block (0,0,0)
=========     Address 0x7ffca4f9a7b0 is out of bounds
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so (cuLaunchKernel + 0x2fe) [0x28187e]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch3 + 0x1d59) [0x1a64a]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so [0x1b392]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch + 0x13a) [0x1b4ce]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccg.so (__pgi_uacc_launch + 0x1ff) [0x18f92]
=========     Host Frame:./ChannelCppProposal [0x2ffd5]
=========     Host Frame:./ChannelCppProposal [0x2dfe4]
=========     Host Frame:./ChannelCppProposal [0x2dd77]
=========     Host Frame:./ChannelCppProposal [0x2fcc5]
=========     Host Frame:./ChannelCppProposal [0x2eaf7]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xe7) [0x21b97]
=========     Host Frame:./ChannelCppProposal [0x65fa]
=========
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuStreamSynchronize. 
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so (cuStreamSynchronize + 0x165) [0x281355]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch3 + 0x20c9) [0x1a9ba]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so [0x1b392]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch + 0x13a) [0x1b4ce]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccg.so (__pgi_uacc_launch + 0x1ff) [0x18f92]
=========     Host Frame:./ChannelCppProposal [0x2ffd5]
=========     Host Frame:./ChannelCppProposal [0x2dfe4]
=========     Host Frame:./ChannelCppProposal [0x2dd77]
=========     Host Frame:./ChannelCppProposal [0x2fcc5]
=========     Host Frame:./ChannelCppProposal [0x2eaf7]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xe7) [0x21b97]
=========     Host Frame:./ChannelCppProposal [0x65fa]
=========
Failing in Thread:1
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuCtxSynchronize. 
=========     Saved host backtrace up to driver entry point at error
call to cuStreamSynchronize returned error 719: Launch failed (often invalid pointer dereference)

=========     Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so (cuCtxSynchronize + 0x152) [0x258c22]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_error_handler + 0x258) [0xef30]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch3 + 0x20ec) [0x1a9dd]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so [0x1b392]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccn.so (__pgi_uacc_cuda_launch + 0x13a) [0x1b4ce]
=========     Host Frame:/opt/pgi/linux86-64-llvm/19.4/lib/libaccg.so (__pgi_uacc_launch + 0x1ff) [0x18f92]
=========     Host Frame:./ChannelCppProposal [0x2ffd5]
=========     Host Frame:./ChannelCppProposal [0x2dfe4]
=========     Host Frame:./ChannelCppProposal [0x2dd77]
=========     Host Frame:./ChannelCppProposal [0x2fcc5]
=========     Host Frame:./ChannelCppProposal [0x2eaf7]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xe7) [0x21b97]
=========     Host Frame:./ChannelCppProposal [0x65fa]
=========
========= ERROR SUMMARY: 3 errors

1 Answers

"Illegal address during kernel execution" is similar to a segmentation violation (segv) on the host where a bad address is being used.

I can't be sure, but "Address 0x7ffca4f9a7b0" looks to me like a host address.

Also, from the signature of linearDiscretization_135_gpu, it looks like you're using vectors in your code. How are you managing the data for these vectors? A std::vector is an opaque class that typically holds three pointers. Since OpenACC data clauses perform shallow copies, if you put a vector in a data clause, only those pointers are copied, not the data they point to. So if I'm correct about the host address, one possible cause is that you're copying a vector, which copies its host pointer values to the device. Dereferencing those host pointers inside the kernel would then cause the illegal address error.
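
To make the shallow-copy point concrete, here is a minimal host-only sketch. The `Buffer` struct and both helper functions are hypothetical names invented for illustration; they are not part of OpenACC or of your code, and `Buffer` is only a loose stand-in for how a vector is commonly laid out. It shows that copying the bytes of an object that holds a pointer duplicates the address, not the elements behind it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a container that refers to a separate buffer.
struct Buffer {
    double*     data;  // address of the element storage
    std::size_t size;  // number of elements
};

// Byte-wise copy of the object itself, similar in spirit to how a shallow
// data clause treats an aggregate: the pointer value is duplicated, the
// elements it points to are not.
inline Buffer shallowCopy(const Buffer& src) {
    Buffer dst;
    std::memcpy(&dst, &src, sizeof(Buffer));
    return dst;
}

// True when both objects refer to the very same element storage.
inline bool sharesStorage(const Buffer& a, const Buffer& b) {
    return a.data == b.data;
}
```

On the host this is harmless because both copies live in the same address space. If the copy lives on the device while `data` still holds a host address, any dereference there is out of bounds.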

For vectors, you either need to perform a manual deep copy or, if you're using PGI, try compiling with "-ta=tesla:managed" so that CUDA Unified Memory is used. The vectors' pointers will then be unified addresses accessible from both the host and the device.
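
As a rough sketch of a simpler alternative to a full deep copy, you can pass the vectors' raw element arrays to the data clauses instead of the vector objects. Since I haven't seen your code, `scaleOnDevice` and its parameters are made up for illustration and are not your `linearDiscretization` routine. Without an OpenACC compiler the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: writes factor * in[i] into out[i].
// Only the raw element arrays appear in the data clauses, so the runtime
// copies the elements themselves rather than the vectors' internal pointers.
inline void scaleOnDevice(const std::vector<double>& in,
                          std::vector<double>& out,
                          double factor) {
    assert(in.size() == out.size());

    const std::size_t n = in.size();
    const double* inPtr  = in.data();   // host address of the input elements
    double*       outPtr = out.data();  // host address of the output elements

    // copyin/copyout with explicit array sections move n elements each way.
    #pragma acc parallel loop copyin(inPtr[0:n]) copyout(outPtr[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        outPtr[i] = factor * inPtr[i];
    }
}
```

The important detail is that the loop body only touches `inPtr` and `outPtr`, never the vectors themselves, so no host-side vector object needs to be visible on the device.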

Granted, this is pure guesswork, so you may need to investigate further. You might try setting the environment variable PGI_ACC_DEBUG=1 (for PGI) or CRAY_ACC_DEBUG=1 (for Cray) to have the runtime print detailed information. I'm not sure whether GNU has an equivalent environment variable for its OpenACC implementation.

If you need more help investigating, please provide a small reproducing example and we can see if we can determine what's wrong.