To measure metrics/events for CUDA programs, I have tried using the command line like:
nvprof --metrics <<metric_name>>
I also measured the same metrics in the Visual Profiler, nvvp. I noticed no difference in the values I get.
The only difference in output I noticed is for a metric like achieved_occupancy. But its value varies with every execution, which is probably why I get different results each time I run it, irrespective of whether I am using nvvp or nvprof.
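For illustration, the run-to-run variation described above could be checked with a sketch like the following. It only repeats the nvprof invocation already shown for the achieved_occupancy metric. The binary name ./my_app and the run count are hypothetical placeholders, not something from my actual setup.

```shell
# Hypothetical application binary (an assumption; replace with your own executable).
APP=./my_app
METRIC=achieved_occupancy
RUNS=3   # arbitrary number of repetitions chosen for this sketch

# The command that gets repeated, kept as a string for display only.
CMD="nvprof --metrics $METRIC $APP"

if command -v nvprof >/dev/null 2>&1 && [ -x "$APP" ]; then
  i=1
  while [ "$i" -le "$RUNS" ]; do
    echo "run $i: $CMD"
    # Same invocation each time; compare the reported metric across runs.
    nvprof --metrics "$METRIC" "$APP"
    i=$((i + 1))
  done
else
  echo "nvprof or $APP not found; would run: $CMD"
fi
```

If the reported values differ between iterations of this loop, the variation comes from the executions themselves rather than from the choice of profiler front end.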
The question:
I was under the impression that nvvp and nvprof are exactly the same, and that nvvp is simply a GUI built on top of nvprof for ease of use. However, I have been given this advice:
Always use the visual profiler. Never use the command line.
Also, this question says:
I do not want to use the command line profiler as I need the global load/store efficiency, replay and DRAM utilization, which are much more visible in the visual profiler.
Apart from 'dynamic' metrics like achieved_occupancy, I never noticed any differences in results. So, is this advice valid? Is there some sort of deficiency in the way nvprof works? I would like to know the advantages of using the visual profiler over the command line form, if there are any.
More specifically, are there metrics for which nvprof gives wrong results?
Note:
My question is not the same as this or this because these are asking about the difference between nvvp and Nsight.