Collecting data for a particular process from the PMU every 1 millisecond

Question

I would like to read the hardware performance counters for a particular PID every 1 millisecond and save the output to a text file.

The code below collects data for all processes running on the system in parallel for a fixed duration and then writes it to one text file per process.

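    #!/bin/sh
    # Write the PID of every running process to a file, one per line
    # ("pid=" suppresses the header line).
    ps -e -o pid= > pids.txt
    # Start one perf stat per PID in the background; each instance counts
    # for 5 seconds and logs to results-<pid> through file descriptor 3.
    while read -r pid
    do
        3>"results-$pid" perf stat -p "$pid" --log-fd 3 sleep 5 > /dev/null &
    done < pids.txt
    # Wait for all background perf instances to finish.
    wait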

How should the loop be written to collect the stats for one process every 1 millisecond?

Apart from perf, is there any other way to monitor these details every 1 millisecond? — Susmitha
Use perf stat --interval-print msecs -p $tmp so you're not trying to fork+exec a new perf process every millisecond. The manual says the minimum interval for that is 10ms, but you could maybe build a custom version. The default timeslice is normally 10ms, so you might need a kernel configured differently, maybe with HZ=1000 if Linux still uses HZ. (I haven't paid attention to scheduler tick resolution vs. NO_HZ tickless kernels recently.) — Peter Cordes
perf record --timestamp will record timestamps on events. The manual says "you can use perf report -D to see the timestamps". I think something like perf record might be your best bet for recording things, and then you can process that data later. If you need something to happen every 1ms, you definitely want to avoid running a shell loop while recording data; that's a lot of system load. Use perf's system-wide mode to have one instance of perf collect data from everything. — Peter Cordes
What exactly do you want to know about your system that you can't learn with a normal perf record --all-cpus --timestamp / perf report -D, or other monitoring tools? — Peter Cordes
perf stat can only log every 10ms. perf record --timestamp records all events as they happen, with high-resolution timestamps. The 10ms limit doesn't apply in any way to perf record. — Peter Cordes
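A minimal sketch of the perf stat suggestion from the comments above: one long-running perf stat instance prints count deltas at a fixed interval instead of a shell loop starting perf repeatedly. The names `PERF`, `stat_pid`, and the `results-<pid>.csv` file name are illustrative assumptions, not part of perf; the interval is passed in because the comments cite a 10 ms minimum.

```shell
#!/bin/sh
# PERF can be overridden (e.g. PERF=echo) to preview the command line.
PERF=${PERF:-perf}

# stat_pid PID INTERVAL_MS SECONDS  (hypothetical helper)
stat_pid() {
    pid=$1        # process to monitor
    interval=$2   # print interval in milliseconds
    seconds=$3    # how long to keep counting
    # -I prints count deltas every $interval ms, -x, selects CSV output,
    # -o writes the counts to a text file; "sleep" bounds the run time.
    "$PERF" stat -I "$interval" -x, -p "$pid" \
        -o "results-$pid.csv" -- sleep "$seconds"
}

# Example: stat_pid 1234 10 5
```

Each interval then appears as its own timestamped rows in the output file, so no outer loop is needed.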

Zulan · Accepted Answer · 2018-04-12T11:22:34

Reading performance counters at this rate is a bit of a stretch in terms of overhead. That is exactly the reason why perf stat has a lower limit of 10 ms periods. It runs a userspace task for reading the counters in those intervals.

On the other hand, perf record will set up the perf events such that they are recorded by the kernel itself when the counter overflows. The advantage is that it has less overhead, but the event is not necessarily recorded at regular time intervals. If you set perf record --freq 1000, the kernel will adapt the overflow rate of the counter, trying to achieve the requested 1 millisecond intervals. The resulting time intervals will not be constant unless your event rate is really stable. If your event rate varies greatly, so will the time intervals.
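As a rough sketch of the above for a single process, the following wraps perf record with a 1000 Hz target rate and timestamps. `PERF`, `record_pid`, and the `perf-<pid>.data` file name are hypothetical names chosen for illustration; the samples will only approximate 1 ms spacing, as described.

```shell
#!/bin/sh
# PERF can be overridden (e.g. PERF=echo) to preview the command line.
PERF=${PERF:-perf}

# record_pid PID SECONDS  (hypothetical helper)
record_pid() {
    pid=$1       # process to sample
    seconds=$2   # recording duration
    # --freq 1000 asks the kernel to aim for about 1000 samples per second;
    # --timestamp stores a timestamp with every sample.
    "$PERF" record --freq 1000 --timestamp --pid "$pid" \
        -o "perf-$pid.data" -- sleep "$seconds"
}

# Example: record_pid 1234 5
```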

Note that there is a mechanism in the kernel that will try to prevent perf from causing too much overhead. At your requested rate you will probably hit it.
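To see whether this limit is likely to apply, you can inspect the kernel's perf settings. The sketch below assumes the mechanism meant here is the one controlled by the `perf_event_max_sample_rate` and `perf_cpu_time_max_percent` sysctls; `show_perf_limits` and its optional directory argument are illustrative names.

```shell
#!/bin/sh
# show_perf_limits [DIR]  (hypothetical helper)
# Prints the perf-related sysctl values found in DIR
# (default: /proc/sys/kernel). Missing files are skipped.
show_perf_limits() {
    dir=${1:-/proc/sys/kernel}
    for name in perf_event_max_sample_rate perf_cpu_time_max_percent; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# Example: show_perf_limits
```

If the reported maximum sample rate is below the frequency you request, the kernel will not sample as often as asked.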

Also, you should not set up separate recordings for an excessive number of PIDs. Instead, set up a single system-wide recording, e.g.:

    perf record --all-cpus --timestamp --freq 1000

You get one result file that you can process according to the PID with perf script. In addition to the text output, perf script allows you to process the events in Python or Perl (see man perf-script-python, man perf-script-perl).
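A minimal sketch of splitting the text output by PID, assuming perf script is asked for the fields `pid,time,period,event` and prints the PID as the first column (worth checking against your perf version). `filter_pid` is a hypothetical helper name.

```shell
#!/bin/sh
# filter_pid PID  (hypothetical helper)
# Reads perf script text output on stdin and keeps only the lines
# whose first whitespace-separated field equals PID.
filter_pid() {
    awk -v pid="$1" '$1 == pid'
}

# Example:
#   perf script -F pid,time,period,event | filter_pid 1234 > pid-1234.txt
```

This keeps the recording itself system-wide and moves the per-process selection to post-processing.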
