
I'm trying to get the number of registers per thread for each kernel in my CUDA program. I need this for the CUDA Occupancy Calculator (http://developer.download.nvidia.com/compute/cuda/CUDA_Occupancy_calculator.xls), to determine the highest occupancy my program can reach on the GPU.

I have generated the .cubin file using the -cubin flag, but I am unable to read it in vim or other text editors, as suggested by NVIDIA (http://forums.nvidia.com/index.php?showtopic=31279). Does anyone know how to read it?

Thanks

1 Answer


The easiest solution is to pass -Xptxas -v to nvcc, like so:

$ nvcc -Xptxas -v foo.cu
ptxas info    : Compiling entry function '_Z9my_kernelPfS_f' for 'sm_10'
ptxas info    : Used 2 registers, 20+16 bytes smem

Alternatively, you can use the cudaFuncGetAttributes API function to obtain the same values (register count, shared memory usage) at runtime.
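
A minimal sketch of the runtime approach, assuming a kernel with the same signature as the one in the ptxas output above. The kernel body here is a hypothetical stand-in, so the numbers it reports will not necessarily match yours:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, standing in for one of your own.
__global__ void my_kernel(float *out, float *in, float scale)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = in[i] * scale;
}

int main()
{
    // Ask the runtime for the compiled kernel's resource usage.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, my_kernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // numRegs is the per-thread register count the occupancy calculator asks for.
    printf("registers per thread : %d\n", attr.numRegs);
    printf("static shared memory : %zu bytes\n", attr.sharedSizeBytes);
    printf("local memory/thread  : %zu bytes\n", attr.localSizeBytes);
    printf("max threads per block: %d\n", attr.maxThreadsPerBlock);
    return 0;
}
```

The values are reported for the code actually loaded on the current device, so they can differ from what ptxas prints if you compile for several architectures.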