perfctr is an application that adds support to the Linux kernel (2.4.16 or newer) for using the Performance-Monitoring Counters (PMCs) found in many modern processors.
Supported processors are:
- All Intel Pentium processors, i.e., Pentium, Pentium MMX, Pentium Pro, Pentium II, Pentium III, Pentium M and Pentium 4, including Celeron and Xeon versions.
- The AMD K7 and K8 processor families.
- Cyrix 6x86MX, MII, and III.
- VIA C3 (Cyrix III).
- Centaur WinChip C6/2/3.
- PowerPC 604, 7xx, and 74xx processors.
Here are some key features of "perfctr":
· Each Linux process has its own set of "virtual" PMCs. That is, to a process the PMCs appear to be private and unrelated to the activities of other processes in the system. The virtual PMCs have 64-bit precision, even though current processors only implement 32, 40, or 48-bit PMCs. Each process also has a virtual Time-Stamp Counter (TSC). On most machines, the virtual PMCs can be sampled entirely in user-space without incurring the overhead of a system call.
· A process accesses its virtual PMCs by opening /dev/perfctr and issuing system calls on the resulting file descriptor. A user-space library is included which provides a more high-level interface.
· The driver also supports global-mode or system-wide PMCs. In this mode, each PMC on each processor can be controlled and read. The PMCs and TSC on active processors are sampled periodically and the accumulated sums have 64-bit precision. Global-mode PMCs are accessed via the /dev/perfctr device file; the user-space library provides a more high-level interface.
· The user-space library is accompanied by several example programs that illustrate how the driver and the library can be used.
· Support for performance-counter overflow interrupts is provided for Intel P4 and P6, and AMD K7 and K8 processors.
Kernels older than 2.4.16 are not supported since perfctr-2.6. You can use the previous stable series, perfctr-2.4, if you must use an older kernel, but this has several limitations:
· Older kernels do not support AMD64 (x86-64).
· The performance counters in hyper-threaded P4s/Xeons cannot be used with kernels older than 2.4.15. You'd have to disable hyper-threading or SMP, or restrict yourself to TSC sampling.
· No profiling using counter overflow interrupts, except in 2.4.10 and newer kernels, and some early 2.4-ac/redhat kernels.
· Application code compiled for perfctr-2.4 is not compatible with perfctr-2.6, and vice versa.
· The perfctr-2.4 series does not support 2.6 kernels. Some of these limitations may be fixable. Contact the author if you are willing to fund development in this direction.
· The performance counter interrupt facility requires SMP or uniprocessor APIC support. In the latter case, the BIOS must be reasonably non-buggy. Unfortunately, this is often not the case.
· Neither the kernel driver nor the sample user-space library attempt to hide any processor-specific details from the user.
· This package makes it possible to compute aggregate event and cycle counts for sections of code. Since many x86-type processors use out-of-order execution, it is impossible to attribute exact event or cycle counts to individual instructions.
· Centaur WinChip C6/2/3 support requires that the TSC is disabled.