PAPI - Getting at Hardware Performance Counters
Recently, I wanted to figure out whether an application I was analyzing was memory bound. While on this quest, I was introduced to the Performance Application Programming Interface (PAPI).
There is a rather good HOWTO that shows step-by-step instructions on getting it all running on Debian. The text below is more or less just a short version of that HOWTO, with my thoughts interspersed.
PAPI is a library that hooks into the hardware performance counters, and presents them in a uniform way. Installation is rather simple if you pay attention to the installation instructions.
- Get the kernel source
- Get the perfctr tarball
- Extract the sources, and run the update-kernel script. I really mean this: if you try to be clever and apply the patch by hand, you’ll have a broken source tree. (The script runs patch to fix up some existing kernel files, and then it copies a whole bunch of other files into the kernel tree.)
- Configure, build, install, and reboot into the new kernel
- You can modprobe perfctr and see spew in dmesg
That’s it for perfctr. Now PAPI itself…
- Get & extract the source
- ./configure, make, make fulltest, make install-all
That’s it for PAPI. The make fulltest step runs the test suite. Chances are that the tests will either all pass or all fail. If they fail, then something is wrong (probably with perfctr). If they pass, then you are all set.
There are some examples in the src/examples directory. Those should get you started with using PAPI. It takes about 100 lines of C to get an arbitrary counter going.
Some other time, I’ll talk more about PAPI, and how I used it in my experiments.