Running "nvprof --query-metrics" fails with a CUDA profiling error and does not list any metrics. How can I fix this? My GPU is a GTX 960, the operating system is CentOS 6.5, and I am using the 64-bit version of CUDA 6.5. Here is the output from my machine.

[root@Sekhar finalCodes]# nvprof --query-metrics

Available Metrics:

Name Description

Device 0 (GeForce GTX 960):

======== Error: CUDA profiling error.

[root@Sekhar finalCodes]# nvprof --analysis-metrics

======== Warning: Metric "stall_imc" cannot be found on device 0.

======== Warning: Metric "stall_compute" cannot be found on device 0.

======== Warning: Metric "stall_texture" cannot be found on device 0.

======== Warning: Metric "stall_other" cannot be found on device 0.

======== Warning: Metric "stall_exec_dependency" cannot be found on device 0.

======== Warning: Metric "stall_inst_fetch" cannot be found on device 0.

======== Warning: Metric "stall_sync" cannot be found on device 0.

There are many more lines like these.
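
When the warning list is long, it can help to collect the missing metric names programmatically instead of reading them line by line. The following is a minimal sketch, assuming the nvprof output has been captured as text; the function name `missing_metrics` is hypothetical and the pattern is based only on the warning lines shown above.

```python
import re

# Matches warning lines in the form shown above, for example:
# ======== Warning: Metric "stall_sync" cannot be found on device 0.
_MISSING = re.compile(r'Warning: Metric "([^"]+)" cannot be found on device (\d+)')


def missing_metrics(nvprof_output):
    """Return {device index: [metric names]} for every 'cannot be found' warning."""
    found = {}
    for line in nvprof_output.splitlines():
        match = _MISSING.search(line)
        if match:
            found.setdefault(int(match.group(2)), []).append(match.group(1))
    return found


sample = '''\
======== Warning: Metric "stall_imc" cannot be found on device 0.
======== Warning: Metric "stall_sync" cannot be found on device 0.
'''
print(missing_metrics(sample))  # {0: ['stall_imc', 'stall_sync']}
```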

All my programs compile and run fine.

Also, nvprof ./myFile gives the following output:

==4075== Profiling application: ./myFile

==4075== Profiling result:

Time(%) Time Calls Avg Min Max Name

99.94% 71.093ms 500 142.19us 135.17us 146.46us void collideKernel(SodA, int, int, int)

0.05% 37.151us 9 4.1270us 3.9990us 4.5120us [CUDA memcpy HtoD]

0.01% 7.7760us 2 3.8880us 3.8720us 3.9040us [CUDA memcpy DtoH]

==4075== API calls:

Time(%) Time Calls Avg Min Max Name

75.44% 285.43ms 18 15.857ms 4.3210us 285.35ms cudaMallocPitch

19.14% 72.422ms 1000 72.421us 1.1560us 218.21us cudaEventSynchronize

3.30% 12.491ms 1000 12.490us 706ns 11.523ms cudaEventCreate

0.87% 3.3010ms 500 6.6010us 5.9150us 37.636us cudaLaunch

0.49% 1.8493ms 1000 1.8490us 1.4670us 22.908us cudaEventRecord

0.17% 660.35us 500 1.3200us 1.1920us 4.1100us cudaEventElapsedTime

0.15% 579.85us 83 6.9860us 445ns 264.17us cuDeviceGetAttribute

0.15% 575.57us 1 575.57us 575.57us 575.57us cudaGetDeviceProperties

0.11% 422.92us 2000 211ns 169ns 2.9590us cudaSetupArgument

0.06% 220.54us 11 20.048us 12.854us 62.371us cudaMemcpy2D

0.04% 158.03us 18 8.7790us 3.3490us 81.821us cudaFree

0.04% 155.07us 500 310ns 274ns 1.9820us cudaConfigureCall
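
Because the summary mixes ns, us, and ms, comparing rows by eye is error-prone. Below is a rough sketch that parses one data row of the summary into numeric fields with all times converted to microseconds. The helper names `to_us` and `parse_row` are hypothetical, and the column order (Time(%), Time, Calls, Avg, Min, Max, Name) is assumed from the header lines above; it is meant for data rows only, not the header.

```python
# Unit suffixes as they appear in the output above; "s" is checked last
# because every other suffix also ends in "s".
_SCALE_TO_US = (("ns", 1e-3), ("us", 1.0), ("ms", 1e3), ("s", 1e6))


def to_us(text):
    """Convert a duration such as '71.093ms' or '706ns' to microseconds."""
    for suffix, scale in _SCALE_TO_US:
        if text.endswith(suffix):
            return float(text[:-len(suffix)]) * scale
    raise ValueError("unrecognised duration: %r" % text)


def parse_row(line):
    """Split one summary row into its seven columns."""
    fields = line.split(None, 6)  # the name column may contain spaces
    if len(fields) != 7:
        raise ValueError("expected 7 columns: %r" % line)
    return {
        "percent": float(fields[0].rstrip("%")),
        "time_us": to_us(fields[1]),
        "calls": int(fields[2]),
        "avg_us": to_us(fields[3]),
        "min_us": to_us(fields[4]),
        "max_us": to_us(fields[5]),
        "name": fields[6],
    }


row = parse_row("99.94% 71.093ms 500 142.19us 135.17us 146.46us "
                "void collideKernel(SodA, int, int, int)")
print(row["name"], row["calls"])
```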

What metric(s) are you trying to query? - talonmies
I have a GTX 960 with CUDA 7.5 on Linux, and I have no trouble running nvprof --query-metrics on it. If you want help, you're probably going to have to provide more information. My suggestion would be to provide answers to each of the following (you can edit your question): 1. What OS are you using? 2. What CUDA version are you using? 3. Can you run CUDA codes properly on your GTX 960, such as the deviceQuery and vectorAdd sample codes? 4. Provide the exact nvprof command and the output from that command (copy and paste your session into the question). - Robert Crovella
Try updating your CUDA version from 6.5 to 7.5. Make sure you have a proper driver installed for CUDA 7.5, i.e. 352.xx or newer. - Robert Crovella
I got the metrics with NVIDIA driver 352.63 and CUDA 7.5 on CentOS 6.5. I have another GPU, a GTX 650 Ti, with CUDA 5.5, where even the --metrics option is not recognized. Thanks for the information. There is a lot to study and interpret in the metrics. Interesting! - Thanasekhar Balaiah

1 Answer

With NVIDIA driver 352.63 and CUDA 7.5.18, the metrics are available, except for a few such as:

 "l1_shared_utilization" 
 "alu_fu_utilization" 
 "l2_l1_read_transactions" 
 "l2_l1_write_transactions" 
 "nc_l2_read_transactions" 
 "l2_l1_read_throughput" 
 "l2_l1_write_throughput" 
 "nc_l2_read_throughput" 
 "atomic_throughput"

In short, a newer driver and toolkit version makes the events and metrics available.
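
If only a handful of metrics are unavailable, one option is to request metrics explicitly with nvprof's --metrics option and leave those names out. The sketch below builds such a command line. The set of unavailable names is copied from the list above (it applies to the GTX 960 setup described here, not necessarily to other devices), the helper name `metrics_flag` is hypothetical, and the requested metrics are only examples.

```python
# Metrics reported above as unavailable on the GTX 960 with
# driver 352.63 and CUDA 7.5.18.
UNAVAILABLE = {
    "l1_shared_utilization",
    "alu_fu_utilization",
    "l2_l1_read_transactions",
    "l2_l1_write_transactions",
    "nc_l2_read_transactions",
    "l2_l1_read_throughput",
    "l2_l1_write_throughput",
    "nc_l2_read_throughput",
    "atomic_throughput",
}


def metrics_flag(requested):
    """Return ['--metrics', 'a,b,...'] with the unavailable names filtered out."""
    kept = [name for name in requested if name not in UNAVAILABLE]
    return ["--metrics", ",".join(kept)] if kept else []


# Example request; ./myFile is the application profiled in the question.
wanted = ["achieved_occupancy", "l1_shared_utilization", "ipc"]
command = ["nvprof"] + metrics_flag(wanted) + ["./myFile"]
print(" ".join(command))  # nvprof --metrics achieved_occupancy,ipc ./myFile
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues.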