ARM Cortex-A72 Performance Counter Access in EL0 (User Mode)
The core issue is reading the ARM Cortex-A72 cycle counter (PMCCNTR_EL0) from user mode (EL0) on a Linux-based system. The user tried to read the cycle counter with the MRS instruction in a C program and got an "Illegal Instruction" error. The error means the program executed an instruction that is not permitted at the current Exception Level (EL0). The ARMv8 architecture enforces strict privilege levels, and certain system registers, including the performance monitoring registers, are by default accessible only at higher privilege levels (EL1 or above).
The ARM Cortex-A72 processor, part of the ARMv8-A architecture, provides performance monitoring capabilities through a set of Performance Monitor Registers (PMRs). These registers include the Cycle Counter Register (PMCCNTR_EL0), which counts processor cycles and is often used for precise timing measurements. However, access to these registers is restricted based on the current Exception Level (EL). By default, PMCCNTR_EL0 is accessible only in EL1 (kernel mode) or higher, not in EL0 (user mode).
The user’s code attempts to initialize and read the cycle counter using inline assembly in a C program. The initialization involves enabling counting through the Performance Monitor Control Register (PMCR_EL0), resetting the cycle counter, and enabling the cycle counter through the Count Enable Set register (PMCNTENSET_EL0). Despite running with root privileges on the Linux system, the program fails at the MRS instruction with an "Illegal Instruction" error. This is because root privileges in Linux do not equate to EL1 or higher privilege levels in the ARM architecture. A process running as root still executes at EL0, and certain hardware features, such as performance counters, remain inaccessible without explicit kernel support.
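The failure described above can be reproduced without crashing the program by catching the signal the kernel delivers when the register read traps. The following is a minimal sketch, not the user's original program: the function name probe_pmccntr is hypothetical, and it assumes a POSIX system. When built for anything other than AArch64 it simply reports failure.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: jump back to the probe instead of terminating. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Tries to read PMCCNTR_EL0 from user mode.
 * Returns 0 and stores the value if the read succeeded,
 * -1 if it trapped (or if not built for AArch64). */
int probe_pmccntr(uint64_t *value)
{
#if defined(__aarch64__)
    struct sigaction sa, old;
    volatile int rc = -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(probe_env, 1) == 0) {
        uint64_t v;
        /* Traps to the kernel if EL0 access is not enabled. */
        __asm__ volatile("mrs %0, pmccntr_el0" : "=r"(v));
        *value = v;
        rc = 0;
    }

    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return rc;
#else
    (void)value;
    return -1;
#endif
}
```

A return value of -1 on the Cortex-A72 indicates that the kernel has not enabled EL0 access, regardless of whether the process runs as root.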
Root Privileges vs. ARM Exception Levels and PMU Configuration
The root cause of the issue lies in the misunderstanding of the relationship between Linux user privileges and ARM Exception Levels. Root privileges in Linux do not grant access to privileged ARM registers or instructions. The ARMv8 architecture defines four Exception Levels (EL0 to EL3), with EL0 being the least privileged (user mode) and EL3 being the most privileged (secure monitor mode). Access to performance monitoring registers, such as PMCCNTR_EL0, is restricted to EL1 or higher by default.
The Performance Monitor Unit (PMU) in ARMv8 processors is controlled by several registers, including PMCR_EL0, PMCNTENSET_EL0, and PMUSERENR_EL0. The PMUSERENR_EL0 register is specifically designed to control user-mode access to performance counters. To enable user-mode access to the cycle counter, the PMUSERENR_EL0 register must be configured appropriately. This configuration is typically performed by the kernel, as it requires EL1 privileges.
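To make the role of PMUSERENR_EL0 concrete, the sketch below decodes a register value into the accesses it permits. The bit positions follow the Armv8-A architecture definition of the register; the macro and function names are illustrative and are not taken from any kernel header. Note that for reads of PMCCNTR_EL0, either the EN bit or the CR bit is enough.

```c
#include <stdbool.h>
#include <stdint.h>

/* PMUSERENR_EL0 bit positions as defined by the Armv8-A architecture. */
#define PMUSERENR_EN (UINT64_C(1) << 0) /* EL0 access to PMU registers */
#define PMUSERENR_SW (UINT64_C(1) << 1) /* EL0 writes to PMSWINC_EL0 */
#define PMUSERENR_CR (UINT64_C(1) << 2) /* EL0 reads of PMCCNTR_EL0 */
#define PMUSERENR_ER (UINT64_C(1) << 3) /* EL0 reads of event counters */

/* True if an EL0 read of PMCCNTR_EL0 would not trap: EN or CR suffices. */
bool el0_can_read_cycle_counter(uint64_t pmuserenr)
{
    return (pmuserenr & (PMUSERENR_EN | PMUSERENR_CR)) != 0;
}

/* True if EL0 may also write PMU control registers such as PMCR_EL0. */
bool el0_can_configure_pmu(uint64_t pmuserenr)
{
    return (pmuserenr & PMUSERENR_EN) != 0;
}
```

With a value of 0, which is what a kernel that does not expose the counters would typically program, both functions return false and any EL0 access traps.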
The user’s code attempts to configure the PMU registers directly from user mode, which is not permitted by default. The MRS and MSR instructions are not privileged in themselves, but an EL0 access to these particular registers traps to EL1 unless PMUSERENR_EL0 allows it, and Linux reports the trap to the process as an "Illegal Instruction" signal (SIGILL). Additionally, the Linux kernel may restrict access to performance counters for security and stability reasons, even if the hardware supports user-mode access.
Another potential cause is the lack of proper kernel support for user-mode performance counter access. Some Linux distributions may not enable or configure the PMU for user-mode access by default. In such cases, even if the hardware supports it, the kernel must be configured to allow user-mode access to performance counters. This typically involves enabling specific kernel options and modifying the kernel’s PMU initialization code.
Enabling User-Mode Cycle Counter Access and Kernel Configuration
To resolve the issue, the following steps can be taken to enable user-mode access to the ARM Cortex-A72 cycle counter:
Step 1: Verify Kernel Support for User-Mode PMU Access
Ensure that the Linux kernel is configured to support user-mode access to performance counters. This involves checking the kernel configuration for the following options:
CONFIG_PERF_EVENTS: Enables performance event support in the kernel.
CONFIG_HW_PERF_EVENTS: Enables hardware performance event support.
CONFIG_ARM_PMU: Enables ARM Performance Monitor Unit support.
These options are typically enabled in most modern Linux distributions, but it is important to verify their presence in the kernel configuration. If any of these options are missing, the kernel must be recompiled with the appropriate configuration.
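As a quick runtime check that does not require the kernel configuration file, a program can look for the perf_event_paranoid sysctl, which is only present when perf event support is built in. The sketch below reads it; the function name is hypothetical, and the check says nothing about whether a hardware PMU driver was actually probed.

```c
#include <stdio.h>

/* Reads /proc/sys/kernel/perf_event_paranoid into *level.
 * Returns 0 on success, -1 if the file is missing or unreadable. */
int read_perf_event_paranoid(int *level)
{
    FILE *f = fopen("/proc/sys/kernel/perf_event_paranoid", "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", level) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Higher values are more restrictive for unprivileged users, so a failure of perf_event_open with a permission error is often explained by this setting.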
Step 2: Configure PMUSERENR_EL0 for User-Mode Access
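The PMUSERENR_EL0 register controls user-mode access to performance counters. Two of its bits are relevant to the cycle counter:
PMUSERENR_EN (bit 0): Enables user-mode access to the PMU registers in general, including writes to the control registers.
PMUSERENR_CR (bit 2): Enables user-mode read access to the cycle counter (PMCCNTR_EL0), even when EN is clear.
Setting CR alone is sufficient for read-only access; setting EN as well lets user code start and reset the counter itself.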
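This configuration must be performed in EL1 (kernel mode), and on every core, because PMUSERENR_EL0 is a per-core register. A kernel module or a modified kernel can be used to set these bits. Note that the kernel's own PMU driver also writes this register and, depending on the kernel version, may overwrite the setting. The following code snippet demonstrates how to configure PMUSERENR_EL0 in a kernel module:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>
#include <asm/sysreg.h>

/* PMUSERENR_EL0 bits, defined locally because the kernel header that
 * provides the ARMV8_PMUSERENR_* names has moved between kernel versions. */
#define PMUSERENR_EN (1UL << 0)   /* EL0 access to PMU registers */
#define PMUSERENR_CR (1UL << 2)   /* EL0 read access to PMCCNTR_EL0 */

/* Runs on each online CPU; CPUs brought online later are not covered. */
static void pmu_enable_user_access(void *unused)
{
    u64 val = read_sysreg(pmuserenr_el0);
    write_sysreg(val | PMUSERENR_EN | PMUSERENR_CR, pmuserenr_el0);
}

static int __init pmu_init(void)
{
    on_each_cpu(pmu_enable_user_access, NULL, 1);
    pr_info("PMUSERENR_EL0 configured for user-mode access\n");
    return 0;
}

static void __exit pmu_exit(void)
{
    pr_info("PMU module unloaded\n");
}

module_init(pmu_init);
module_exit(pmu_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Enable user-mode PMU access");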
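Once EL0 access has been granted, user code can perform the initialization described earlier: enable counting in PMCR_EL0, reset the cycle counter, and enable it in PMCNTENSET_EL0. The sketch below shows one way to do this, assuming the EN bit in PMUSERENR_EL0 is set on the core the thread runs on. The helper names are hypothetical, the bit positions come from the Armv8-A register definitions, and the inline assembly is only compiled for AArch64.

```c
#include <stdint.h>

#define PMCR_E        (UINT64_C(1) << 0)   /* PMCR_EL0.E: enable counters */
#define PMCR_C        (UINT64_C(1) << 2)   /* PMCR_EL0.C: reset cycle counter */
#define PMCNTENSET_C  (UINT64_C(1) << 31)  /* PMCNTENSET_EL0.C: cycle counter */

#if defined(__aarch64__)
/* Enables and resets the cycle counter. Traps (SIGILL) if EL0 access
 * has not been granted on the current core. */
static inline void cycle_counter_start(void)
{
    uint64_t pmcr;

    __asm__ volatile("mrs %0, pmcr_el0" : "=r"(pmcr));
    __asm__ volatile("msr pmcr_el0, %0" : : "r"(pmcr | PMCR_E | PMCR_C));
    __asm__ volatile("msr pmcntenset_el0, %0" : : "r"(PMCNTENSET_C));
    __asm__ volatile("isb");
}

/* Reads the current cycle count; the ISB keeps earlier instructions
 * from being reordered past the read. */
static inline uint64_t cycle_counter_read(void)
{
    uint64_t cycles;

    __asm__ volatile("isb\n\tmrs %0, pmccntr_el0" : "=r"(cycles));
    return cycles;
}
#endif
```

Because each core has its own counter, a measurement is only meaningful if the thread stays on one core between the two reads, for example by pinning it with sched_setaffinity. Writing PMCR_EL0 from user space can also interfere with the kernel's own use of the PMU, which is one reason the perf interface in the next step is generally preferable.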
Step 3: Modify User Code to Use Kernel-Provided Interfaces
Instead of directly accessing the PMU registers from user mode, use the kernel-provided interfaces for performance monitoring. The perf subsystem in Linux provides a user-friendly API for accessing performance counters. The following example demonstrates how to use the perf API to measure cycle counts:
#define _GNU_SOURCE
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

int main(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-mode cycles only */
    attr.exclude_hv = 1;

    /* pid = 0, cpu = -1: this thread on any CPU; no group, no flags. */
    int fd = (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd == -1) {
        perror("perf_event_open");
        return EXIT_FAILURE;
    }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* Code to measure */
    int x = 10;
    volatile int y = x * x;
    (void)y;

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t count;
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        perror("read");
        close(fd);
        return EXIT_FAILURE;
    }
    printf("Cycle count: %" PRIu64 "\n", count);

    close(fd);
    return 0;
}
Step 4: Recompile and Test
Recompile the kernel or load the kernel module to enable user-mode PMU access. Then, compile and run the user program to verify that the cycle counter can be accessed without errors. If the kernel module approach is used, ensure that the module is loaded before running the user program.
Step 5: Debugging and Validation
If the issue persists, use debugging tools such as gdb or strace to trace the program’s execution and identify any remaining issues. Additionally, check the kernel logs (dmesg) for any errors related to the PMU or performance counters.
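When the failure comes from perf_event_open rather than from a trapped instruction, the errno value usually narrows down the cause. The helper below is a small sketch that maps the common error codes to a hint; the function name and the hint wording are illustrative, and the interpretations follow the error descriptions in the perf_event_open manual page.

```c
#include <errno.h>

/* Maps an errno value from a failed perf_event_open() call to a hint. */
const char *perf_open_hint(int err)
{
    switch (err) {
    case EACCES:
    case EPERM:
        return "permission denied: check perf_event_paranoid or capabilities";
    case ENOENT:
        return "event type or config not supported by this kernel/PMU";
    case ENODEV:
        return "event needs a feature the current CPU does not provide";
    case EOPNOTSUPP:
        return "requested hardware feature is not supported";
    case EINVAL:
        return "invalid attribute: check attr.size and field values";
    default:
        return "unexpected error: inspect strerror(errno) and dmesg";
    }
}
```

It can be called right after a failed perf_event_open, for example as fprintf(stderr, "%s\n", perf_open_hint(errno)).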
By following these steps, user-mode access to the ARM Cortex-A72 cycle counter can be enabled, allowing for precise timing measurements in Linux applications. This approach leverages the kernel’s perf subsystem, ensuring compatibility and security while providing the necessary functionality.