
According to the documentation for nvprof's event/metric summary mode, the output looks like:

==6461== Profiling application: matrixMul 
==6461== Profiling result: 
==6461== Event result: 
//The outputs 

==6461== Metric result: 
//The outputs

The default mode should show the latencies, percentages, etc. for API calls and kernels under Profiling result. So I have two questions:

  1. Why isn't there any output under Profiling result?
  2. How do I get nvprof to output the Profiling result as well?
Large program with many kernel calls? - kangshiyin
@Eric: Yes. The answer was quite comprehensive though. - user3813674

1 Answer


Why isn't there any output under Profiling result?

The documentation states:

nvprof operates in one of the modes listed below.

Those modes are:

  • 3.1.1 Summary Mode (the default)
  • 3.1.2 GPU-Trace and/or API-Trace Modes
  • 3.1.3 Event/metric Summary Mode
  • 3.1.4 Event/metric Trace Mode

Your excerpt is from 3.1.3 Event/metric Summary Mode. When nvprof runs in this mode, it is not in any of the other modes, so the data collection (and output) described for those modes does not apply.

How do I get nvprof to output Profiling Result also?

If you want to capture metric info on a per-kernel basis, use 3.1.4 Event/metric Trace Mode. Output will then appear in the Profiling Result section.
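
As a rough illustration, the sketch below builds an nvprof command line for event/metric trace mode, which (going by the documentation's own example) is selected by combining --print-gpu-trace with --events and/or --metrics. The helper name metric_trace_command, the ./matrixMul path, and the chosen event and metric names are assumptions for illustration only; use nvprof --query-events and --query-metrics to see what your GPU actually supports.

```python
def metric_trace_command(app, events=(), metrics=()):
    """Build an nvprof argv list for event/metric trace mode.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not events and not metrics:
        raise ValueError("need at least one event or metric")
    # --print-gpu-trace plus --events/--metrics gives per-kernel values.
    cmd = ["nvprof", "--print-gpu-trace"]
    if events:
        cmd += ["--events", ",".join(events)]
    if metrics:
        cmd += ["--metrics", ",".join(metrics)]
    return cmd + [app]


# Example names only; availability depends on the GPU.
cmd = metric_trace_command("./matrixMul", events=["local_load"], metrics=["ipc"])
print(" ".join(cmd))
```

The printed line can be pasted into a shell as-is; passing the list to subprocess.run would be the usual way to launch it from Python.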

For other combinations, nvprof cannot display an arbitrary collection of profiling data in a single run. If you need output that is only available in a particular mode, you must run in that mode to get it, so you may need to run nvprof several times to collect everything you want. nvvp (the visual profiler) does this (i.e. it runs nvprof multiple times under the hood) in order to display a greater range of data for a given application view.
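
If you want to script the "one run per mode" approach yourself, something along these lines might work. It is only a sketch: the function names, the log-file naming scheme, and the example application path are assumptions, and it presumes nvprof is on your PATH. Each run writes its report to its own file via --log-file so that the outputs of the different modes do not get mixed together.

```python
import shutil
import subprocess


def nvprof_runs(app, events=(), metrics=()):
    """Return {mode name: argv list}, one nvprof invocation per mode."""
    counters = []
    if events:
        counters += ["--events", ",".join(events)]
    if metrics:
        counters += ["--metrics", ",".join(metrics)]

    runs = {
        # Default summary mode: per-kernel and per-API-call timing summary.
        "summary": ["nvprof", app],
        # GPU-trace and API-trace modes, enabled together.
        "trace": ["nvprof", "--print-gpu-trace", "--print-api-trace", app],
    }
    if counters:
        # Event/metric summary mode and event/metric trace mode.
        runs["metric_summary"] = ["nvprof", *counters, app]
        runs["metric_trace"] = ["nvprof", "--print-gpu-trace", *counters, app]
    return runs


def run_all(app, events=(), metrics=()):
    """Run the application once per mode; returns the log file names."""
    if shutil.which("nvprof") is None:
        raise RuntimeError("nvprof not found on PATH")
    logs = []
    for mode, cmd in nvprof_runs(app, events, metrics).items():
        log = f"nvprof_{mode}.log"  # hypothetical naming scheme
        # Insert --log-file right after "nvprof" so it precedes the app.
        subprocess.run([cmd[0], "--log-file", log, *cmd[1:]], check=True)
        logs.append(log)
    return logs


# Show what would be executed, without launching anything.
for mode, cmd in nvprof_runs("./matrixMul", metrics=["ipc"]).items():
    print(f"{mode}: {' '.join(cmd)}")
```

Calling run_all("./matrixMul", metrics=["ipc"]) would then execute the application four times. Keep in mind that the program itself runs once per mode, so any side effects it has are repeated too.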