CUDA memcheck address - how to determine location in code?

Question

cuda-memcheck is reporting this information for a release mode CUDA kernel:

========= Error: process didn't terminate successfully
========= Invalid __global__ read of size 4
=========     at 0x000002c8 in xx_kernel
=========     by thread (0,0,0) in block (0,0)
=========     Address 0x10101600014 is out of bounds
=========
========= ERROR SUMMARY: 1 error

This fault only happens in release mode. It also doesn't happen when running under cuda-gdb.

How can I take the 0x000002c8 address and determine which code is causing the fault? I've looked through the cached intermediate files (.ptx, .cubin, etc.) and see no obvious way to map it back to the faulty source code.

This is on x86_64 Linux with CUDA 3.2.

UPDATE: Turns out it was a compiler bug in 3.2. Upgrading to 4.0 makes the memcheck error go away. Also, I was able to disassemble the CUBIN with the cuobjdump from 4.0, but since it was release mode and optimized, it was exceedingly difficult to match the disassembly to the source code.

Can you post your kernel code so that we can see why this thread accesses an out-of-bounds address? — jopasserat
Unfortunately it's proprietary source code, so I can't post the actual code. Thanks. — dwelch91

user703016 · Accepted Answer · 2011-06-24T09:18:35

Download the CUDA Toolkit 4.0 from the NVIDIA Developer Zone. Use the new cuobjdump that supports 2.x cubins.

    cuobjdump -sass /path/to/your/cubin > /path/to/dump.txt

Example output (tested on an sm_20 cubin, code version 2.3):

    ...
    /*6018*/     /*0xe00100075003ff9a*/     CAL 0x46d8;
    /*6020*/     /*0x10001de428000000*/     MOV R0, R4;
    /*6028*/     /*0x00001de428000000*/     MOV R0, R0;
    /*6030*/     /*0x40011de428000000*/     MOV R4, R16;
    ...
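To tie this back to the question: the first commented column in the dump is the instruction offset in hex, so the address reported by cuda-memcheck (0x000002c8 in xx_kernel) should correspond to the line whose first column is that offset within the xx_kernel section. Here is a minimal Python sketch for pulling out that line plus a little context. It assumes the dump has the layout shown above and that the text passed in covers a single function, since offsets restart for each function. The names `parse_sass` and `context` are hypothetical helpers, not part of any CUDA tool.

```python
import re

# Matches dump lines such as:
#   /*6018*/     /*0xe00100075003ff9a*/     CAL 0x46d8;
# Assumption: the layout is the one shown in the example output above.
SASS_LINE = re.compile(r"/\*([0-9a-fA-F]+)\*/\s+/\*0x[0-9a-fA-F]+\*/\s+(.*\S)")


def parse_sass(text):
    """Return {offset: instruction} for each instruction line in the text.

    Assumes the text holds one function only; offsets from several
    functions would overwrite each other in the dictionary.
    """
    table = {}
    for line in text.splitlines():
        match = SASS_LINE.search(line)
        if match:
            table[int(match.group(1), 16)] = match.group(2)
    return table


def context(table, offset, before=2, after=1):
    """Return (offset, instruction) pairs around the given offset.

    Returns an empty list when no instruction starts at that offset.
    """
    if offset not in table:
        return []
    offsets = sorted(table)
    i = offsets.index(offset)
    return [(o, table[o]) for o in offsets[max(0, i - before): i + after + 1]]


# The excerpt from the example output above, used here as sample input.
SAMPLE = """
    ...
/*6018*/     /*0xe00100075003ff9a*/     CAL 0x46d8;
/*6020*/     /*0x10001de428000000*/     MOV R0, R4;
/*6028*/     /*0x00001de428000000*/     MOV R0, R0;
/*6030*/     /*0x40011de428000000*/     MOV R4, R16;
    ...
"""

if __name__ == "__main__":
    for off, instr in context(parse_sass(SAMPLE), 0x6020):
        print(f"/*{off:04x}*/  {instr}")
```

For the fault in the question you would feed in the xx_kernel section of the dump and ask for `context(table, 0x2c8)`. In optimized code the instructions around that offset still have to be matched to the source by hand.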
