
I want to use PAPI 5.5.1 to measure the performance of some of my text algorithms and compare cycle counts, branch mispredictions, and pipeline behavior. I have looked up the native events provided by the library and verified with the papi_avail tool that they are supported on my computer.

It works well when I measure up to 5 native events simultaneously. Beyond that, PAPI refuses to start counting.

Here is a simplified version of my code:

#define NB_EVENTS 6
int CS_Events[NB_EVENTS] = {PAPI_BR_INS, PAPI_BR_MSP, PAPI_TOT_CYC, PAPI_TOT_INS, PAPI_RES_STL, PAPI_TOT_IIS};
PAPI_start_counters(CS_Events, NB_EVENTS); /* returns PAPI_ECNFLCT */

The description of PAPI_ECNFLCT is:

Hardware event exists, but cannot be counted due to counter resource limitations


I could not find anything more about this in the PAPI or perf documentation. (I am interested in x86 (32- and 64-bit) and ARM processors.)

  • So it seems there is a hardware limit on the number of counters?

  • Is there a table of these limits per processor?

  • Is there any other way to do this?

Yes, there is a hardware limit, but it's rather complicated. Trying is certainly the easiest way to find out. If you specify your microarchitecture in the question, there is a better chance of getting a specific answer. - Zulan
You are confusing native events with preset events. For example, PAPI_BR_INS is a preset event that maps to the native event BR_INST_RETIRED:ALL_BRANCHES on my Skylake CPU. The accepted answer (stackoverflow.com/a/51099181/1319478) talks about native events. Sometimes a preset event maps to exactly one native event, but in other cases it does not. - quepas

1 Answer


PAPI (and other performance monitoring libraries) are built on hardware performance counters. Basically, you program some model-specific registers (MSRs) to monitor a set of events. The number of events you can monitor at once is therefore limited, typically to between 4 and 8, depending also on whether hyper-threading is enabled (with hyper-threading disabled, each core has access to more counters). As a comment pointed out, once you know which architecture you are using, you can find the number of available counters in its documentation.
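On Intel x86 processors the counter limit can also be read programmatically, because CPUID leaf 0x0A (architectural performance monitoring) reports it. The following is only a minimal sketch, assuming GCC or Clang (for <cpuid.h>); the function names pmu_general_counters and pmu_fixed_counters are hypothetical helpers of mine, not part of PAPI or perf. On AMD, ARM, or inside many virtual machines the leaf is missing or zeroed, so a result of -1 or 0 only means "not reported here", not "no counters".

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Hypothetical helper: number of general-purpose performance counters per
 * logical processor, taken from CPUID.0AH:EAX[15:8].
 * Returns -1 when the leaf is not available (non-x86 or old CPU). */
int pmu_general_counters(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x0A, &eax, &ebx, &ecx, &edx))
        return -1;                      /* leaf 0x0A not supported */
    return (int)((eax >> 8) & 0xFF);
#else
    return -1;                          /* not an x86 build */
#endif
}

/* Hypothetical helper: number of fixed-function counters, taken from
 * CPUID.0AH:EDX[4:0]. The field is only defined when the version ID in
 * EAX[7:0] is greater than 1; otherwise 0 is returned. */
int pmu_fixed_counters(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x0A, &eax, &ebx, &ecx, &edx))
        return -1;                      /* leaf 0x0A not supported */
    if ((eax & 0xFF) < 2)
        return 0;                       /* field not defined before version 2 */
    return (int)(edx & 0x1F);
#else
    return -1;                          /* not an x86 build */
#endif
}
```

Calling both functions from main and printing the results gives a quick per-machine answer. The numbers are what the hardware advertises; the operating system or another profiler may already have reserved some of those counters, so PAPI can still accept fewer events than reported.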

If you want to monitor more events than there are available counters, you should use a technique called multiplexing. You can do this with PAPI, but I have never tried it: http://icl.cs.utk.edu/projects/papi/wiki/Multiplexing
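In PAPI, multiplexing is enabled with PAPI_multiplex_init() and, per event set, PAPI_set_multiplex(). It helps to know what it costs: the events take turns on the physical counters, so each reported value is an estimate extrapolated from the slice of time the event was actually being counted. The sketch below illustrates that extrapolation the way perf exposes it (raw count scaled by time enabled over time running). The function scale_multiplexed_count is a hypothetical helper for illustration, not a PAPI or perf API, and PAPI's internal estimate may differ in detail.

```c
#include <stdint.h>

/* Hypothetical helper (not part of PAPI or perf): estimate the count an
 * event would have reached had it owned a hardware counter for the whole
 * measurement.
 *   raw_count       - value accumulated while the event was scheduled
 *   time_enabled_ns - total time the event was enabled
 *   time_running_ns - time the event actually occupied a counter */
uint64_t scale_multiplexed_count(uint64_t raw_count,
                                 uint64_t time_enabled_ns,
                                 uint64_t time_running_ns)
{
    if (time_running_ns == 0)
        return 0;                 /* never scheduled: nothing to extrapolate */
    if (time_running_ns >= time_enabled_ns)
        return raw_count;         /* counted the whole time: value is exact */

    /* Linear extrapolation, rounded to the nearest integer. */
    double scaled = (double)raw_count
                  * (double)time_enabled_ns / (double)time_running_ns;
    return (uint64_t)(scaled + 0.5);
}
```

For example, an event that counted 1000 occurrences while scheduled for 100 ns out of 300 ns enabled is reported as roughly 3000. The extrapolation assumes the event rate is uniform over the run, so short or bursty code sections can give misleading numbers; for those, running the program several times with at most as many events as there are counters is the more reliable option.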

You can also try to use perf. It's very good and has a trace utility.