Debugging CUDA kernels

Question

I have an OpenCV application with additional CUDA (.cu) files, which I would like to debug using Parallel Nsight. Nsight debugging works on the CUDA samples (which have no OpenCV .cpp files), but when I start the debugger in my application, it loads lots of additional modules ("no symbols loaded") and crashes with this error:

OpenCV Error: Gpu API call (out of memory) in unknown function, file ..\.\
opencv-2.4.4\modules\core\src\gpumat.cpp, line 1415

A window also opens, titled "Microsoft Visual C++ Debug Library", with the messages "Debug error!" and "R6010 abort has been called".

What could be the issue? Can loading these modules be avoided? I am not sure they are necessary.

And what is the correct way to debug CUDA kernels? I know that CPU and GPU code cannot be debugged at the same time.

Edit:

I am pretty sure that loading more than 200 kernels is what makes it crash. A single gpu::GpuMat declaration brings in more than 100 kernels (modules) on its own, and SURF, BFM and similar algorithms load the rest...

I'd like to debug only the kernels in which I put breakpoints (i.e. my own kernels, not the OpenCV ones). Is it possible to exclude the other modules/kernels somehow?

Thanks!

Jeff Davis · Accepted Answer · 2013-04-22T23:28:26

It sounds like debug symbols have been compiled for all of your OpenCV kernels, which is not what you want. Make sure you are not building OpenCV with CUDA debug flags. Specifically, you don't want the -g/-G/--debug* flags being passed to nvcc when OpenCV itself is built.
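As a rough illustration of keeping device debug info limited to your own code, here is a minimal sketch of a standalone kernel file. The file name my_kernels.cu, the kernel scaleKernel and the build line in the comment are assumptions made for this example and are not taken from the question; the idea is only that -g/-G go on the file containing your kernels, while OpenCV is built without them.

```cuda
// my_kernels.cu (hypothetical file name, for illustration only)
// Assumed build line: device debug info (-G) only for this file, e.g.
//   nvcc -g -G -o my_app my_kernels.cu
// OpenCV itself would be built without -g/-G.
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: multiplies each element by a constant factor.
__global__ void scaleKernel(float* data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;  // a device breakpoint could be placed here
    }
}

int main()
{
    const int n = 256;
    float host[n];
    for (int i = 0; i < n; ++i) {
        host[i] = 1.0f;
    }

    // Check the allocation explicitly so an out-of-memory condition
    // is reported here instead of surfacing later as an abort.
    float* dev = nullptr;
    cudaError_t err = cudaMalloc(&dev, n * sizeof(float));
    if (err != cudaSuccess) {
        std::printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(dev, 2.0f, n);

    // Launch errors are reported by cudaGetLastError, execution errors
    // by the synchronize call that follows.
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        std::printf("kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }

    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("host[0] = %f\n", host[0]);

    cudaFree(dev);
    return 0;
}
```

With a layout like this, only the kernels in your own .cu files carry device debug information, so those are the ones you would expect to be able to step through.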

Debugging a lot of kernels does affect performance, but it should not cause crashes. I would recommend upgrading to Nsight 3.0, which is available now from the Nsight Visual Studio Edition Early Access site. Many improvements have been made in this version.
