Cortex-A9 Core Lockup and Debugger Inability to Halt Execution
When working with ARM Cortex-A9 processors, particularly in a dual-core configuration such as the Cyclone V SoC, one of the most frustrating issues that can arise during debugging is the inability to stop one of the cores. This issue manifests when using debugging tools like ARM DS-5, where the debugger reports errors such as "Unable to stop device Cortex-A9 SMP" or "Cannot stop target." In this scenario, one core (Cortex-A9-0) remains running and unresponsive to debug commands, while the other core (Cortex-A9-1) can be stopped and controlled normally. This behavior severely limits the ability to debug the system, especially in baremetal or RTOS environments where precise control over execution is critical.
The Cortex-A9 core lockup issue is particularly problematic because it prevents developers from inspecting the state of the running core, setting breakpoints, or single-stepping through code. This can occur even when the system appears to be functioning correctly outside of the debugger. The root cause of this issue often lies in the interaction between the debugger, the processor’s debug logic, and the software running on the core. Understanding the underlying causes and implementing effective solutions requires a deep dive into the Cortex-A9 architecture, its debug features, and the specific conditions that can lead to such lockups.
Debug Logic Misconfiguration and Core State Mismatch
One of the primary causes of the Cortex-A9 core lockup issue is a misconfiguration of the processor’s debug logic or a mismatch between the core’s actual state and the state perceived by the debugger. The Cortex-A9 processor includes a Debug Unit that provides features such as breakpoints, watchpoints, and the ability to halt the processor for debugging. However, these features rely on precise synchronization between the debugger and the processor’s internal state. If this synchronization is lost, the debugger may be unable to halt the core, leading to the observed lockup.
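A common scenario is a halt request that arrives while the core is in a state where invasive debug is not permitted. The Cortex-A9 processor has multiple execution modes, including User, FIQ, IRQ, Supervisor, Monitor, Abort, Undefined, and System modes, but privilege by itself does not block a halt. What matters is the debug authentication signals: if DBGEN is low, invasive debug is disabled altogether, and if the core is in Secure state (for example, executing secure monitor code) while SPIDEN is low, the halt request typically stays pending until the core leaves Secure state. In a baremetal application that runs entirely in Secure state, that may never happen. Additionally, if the core is in a low-power state with its clock gated or its power removed, the debug logic may not be able to interrupt the core’s operation.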
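The authentication rule described above can be written down as a small predicate. This is only a sketch of the architectural rule for invasive debug: the type and function names are hypothetical, the Secure User debug exception controlled by SDER is ignored, and on real hardware DBGEN and SPIDEN are signals driven by the SoC, not variables.

```c
#include <stdbool.h>

/* Hypothetical model of the debug authentication inputs.
 * dbgen and spiden mirror the DBGEN and SPIDEN signals. */
typedef struct {
    bool dbgen;   /* invasive debug enable */
    bool spiden;  /* secure privileged invasive debug enable */
} debug_auth_t;

/* Returns true when a halt request can take effect in the current
 * security state; false means the request is ignored or left pending. */
static bool halt_permitted(debug_auth_t auth, bool secure_state)
{
    if (!auth.dbgen) {
        return false;   /* invasive debug is disabled everywhere */
    }
    if (secure_state && !auth.spiden) {
        return false;   /* Secure state additionally needs SPIDEN */
    }
    return true;
}
```

For example, `halt_permitted((debug_auth_t){true, false}, true)` is false, which corresponds to a core that runs in Secure state and never reacts to the debugger.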
Another potential cause is a debugger script or piece of target code that treats the Cortex-A9 like a Cortex-M device. The Debug Halting Control and Status Register (DHCSR), with its C_DEBUGEN and C_HALT bits, belongs to the ARMv7-M debug architecture and does not exist on the Cortex-A9. The Cortex-A9 implements the ARMv7-A debug architecture, in which the Debug Status and Control Register (DBGDSCR) reports whether the core is halted and holds the halting debug-mode enable bit (HDBGen), and a halt is requested by setting the HRQ bit in the Debug Run Control Register (DBGDRCR). If HDBGen is clear, breakpoints, watchpoints, and vector catches do not take the core into debug state. If the halt request never reaches DBGDRCR, or invasive debug is not permitted when it does, the core simply continues executing instructions.
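On the Cortex-A9, the debug status bits live in DBGDSCR, as defined by the ARMv7-A debug architecture. When a raw DBGDSCR value is available (for instance from a debugger's register view), a small decoder makes it easier to see why a halt did or did not happen. The sketch below assumes the ARMv7-A bit layout; the struct and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBGDSCR_HALTED     (1u << 0)   /* core is in debug state */
#define DBGDSCR_MOE_SHIFT  2           /* method of debug entry, bits [5:2] */
#define DBGDSCR_MOE_MASK   0xFu
#define DBGDSCR_HDBGEN     (1u << 14)  /* halting debug-mode enable */
#define DBGDSCR_MDBGEN     (1u << 15)  /* monitor debug-mode enable */

typedef struct {
    bool halted;
    bool halting_debug_enabled;
    bool monitor_debug_enabled;
    unsigned moe;  /* e.g. 0 = halt request, 1 = breakpoint, 5 = vector catch */
} dscr_info_t;

/* Split a raw DBGDSCR value into the fields relevant to halting. */
static dscr_info_t decode_dbgdscr(uint32_t dscr)
{
    dscr_info_t info;
    info.halted                = (dscr & DBGDSCR_HALTED) != 0;
    info.halting_debug_enabled = (dscr & DBGDSCR_HDBGEN) != 0;
    info.monitor_debug_enabled = (dscr & DBGDSCR_MDBGEN) != 0;
    info.moe = (dscr >> DBGDSCR_MOE_SHIFT) & DBGDSCR_MOE_MASK;
    return info;
}
```

A value with `halted` clear and `halting_debug_enabled` clear suggests that the debugger never managed to enable halting debug-mode on that core.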
The interaction between the two cores in a dual-core Cortex-A9 system can also contribute to the issue. In a symmetric multiprocessing (SMP) configuration, the cores share resources such as the L2 cache and memory. If one core is halted while the other continues to run, the running core can starve: for example, if Cortex-A9-1 is halted while holding a spinlock, Cortex-A9-0 spins on that lock indefinitely. A spinning core can normally still be halted, however, so a core that ignores halt requests entirely is more often stalled on a memory or peripheral access that never completes, because the core cannot enter debug state until the outstanding transaction finishes. On the Cyclone V SoC, an access through an HPS-to-FPGA bridge to logic that never responds is a plausible example.
Resolving Debug Logic Issues and Ensuring Core Synchronization
To address the Cortex-A9 core lockup issue, a systematic approach is required to ensure that the debug logic is properly configured and that the cores are synchronized during debugging. The following steps outline a detailed troubleshooting and resolution process:
Step 1: Verify Debugger Configuration and Core State
The first step is to verify that the debugger is properly configured to interact with the Cortex-A9 cores. This includes ensuring that the debugger is using the correct JTAG or SWD interface and that the target device is correctly identified. In DS-5, this can be done by checking the debug configuration settings and ensuring that the correct target device (Cyclone V SoC) is selected.
Next, verify the state of the cores using the debugger. The Current Program Status Register (CPSR) can only be read while a core is halted, so for a core that refuses to halt, start from the last point at which it could still be stopped (for example, immediately after reset or at a breakpoint placed early in the startup code) and work forward from there. At each stop, check the execution mode in the CPSR and whether the core is handling an exception. If the core turns out to be in Abort or Undefined mode, the fault that put it there is usually the real problem. If it is in Secure state with secure invasive debug disabled, the fix lies in the debug authentication settings rather than in the exception handling code.
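Reading the mode out of a raw CPSR value is a matter of decoding its low five bits. The helper below is a minimal sketch using the ARMv7-A mode encodings; the function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Map the CPSR mode field M[4:0] to a readable name (ARMv7-A encodings). */
static const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1Fu) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x16: return "Monitor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "Unknown";
    }
}

/* True when the core is in a mode entered by a fault (Abort or Undefined). */
static bool cpsr_in_fault_mode(uint32_t cpsr)
{
    uint32_t mode = cpsr & 0x1Fu;
    return mode == 0x17u || mode == 0x1Bu;
}
```

For instance, a CPSR of 0x600001D3 decodes to Supervisor mode with IRQ and FIQ masked, which is a typical value early in a baremetal startup sequence.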
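Step 2: Check the Debug Status and Control Register (DBGDSCR) and the Halt Request
The DHCSR, C_DEBUGEN, and C_HALT names found in Cortex-M documentation do not apply here. On the Cortex-A9, DBGDSCR.HDBGen (bit 14) enables halting debug-mode and DBGDRCR.HRQ (bit 0) requests a halt. The debugger normally performs both operations through the debug access port, so target code rarely needs to. For experiments, the following sketch issues a halt request through the memory-mapped debug registers in a baremetal environment. CORE_DBG_BASE is a placeholder for the base address of the target core’s debug register block, which is SoC-specific and must be taken from the device documentation:
LDR R0, =CORE_DBG_BASE ; Base of the core's debug registers (placeholder, SoC-specific)
LDR R1, =0xC5ACCE55 ; Lock access key
STR R1, [R0, #0xFB0] ; DBGLAR: unlock memory-mapped writes
MOV R1, #1 ; HRQ (halt request)
STR R1, [R0, #0x090] ; DBGDRCR: request entry to debug state
If the core still cannot be halted, check whether other parts of the code are interfering with the debug logic. For example, if the RTOS or application code clears HDBGen, or removes the clock or power from the core, this could prevent the debugger from halting the core.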
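In C, a halt request against the ARMv7-A memory-mapped debug registers could look roughly like the following. This is a sketch under stated assumptions: the register offsets and the lock key follow the ARMv7-A debug architecture, the function name is hypothetical, and the caller supplies the base pointer of the target core's debug block, which is SoC-specific and not given in this article. It is meant to run on the other core or against a host-side model, because a core cannot observe its own halted state.

```c
#include <stdbool.h>
#include <stdint.h>

#define DBG_WORD(off)    ((off) / 4u)   /* byte offset to word index */
#define DBGDSCR_OFF      0x088u
#define DBGDRCR_OFF      0x090u
#define DBGLAR_OFF       0xFB0u
#define DBGLAR_KEY       0xC5ACCE55u
#define DBGDRCR_HRQ      (1u << 0)
#define DBGDSCR_HALTED   (1u << 0)

/* Request a halt and poll DBGDSCR.HALTED a bounded number of times.
 * Returns true if the target core reported debug state in time. */
static bool request_halt(volatile uint32_t *dbg, unsigned max_polls)
{
    dbg[DBG_WORD(DBGLAR_OFF)]  = DBGLAR_KEY;    /* unlock register writes */
    dbg[DBG_WORD(DBGDRCR_OFF)] = DBGDRCR_HRQ;   /* halt request */
    for (unsigned i = 0; i < max_polls; i++) {
        if (dbg[DBG_WORD(DBGDSCR_OFF)] & DBGDSCR_HALTED) {
            return true;
        }
    }
    return false;   /* core never entered debug state */
}
```

The bounded poll is deliberate: a `false` return is the programmatic equivalent of the "Cannot stop target" message and is a reasonable point at which to log the last DBGDSCR value.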
Step 3: Synchronize Cores in SMP Configuration
In a dual-core Cortex-A9 system, it is essential to ensure that both cores are synchronized during debugging. This can be achieved by using inter-core communication mechanisms such as spinlocks or semaphores to coordinate access to shared resources. For example, if Cortex-A9-0 is accessing a shared resource, Cortex-A9-1 should wait until the resource is available before attempting to access it.
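Additionally, use the right primitive for each job. Mutual exclusion between the cores is built from the exclusive-access instructions (LDREX and STREX), while the barrier instructions, namely Data Memory Barrier (DMB), Data Synchronization Barrier (DSB), and Instruction Synchronization Barrier (ISB), ensure that memory accesses are properly ordered and visible to the other core. Barriers alone do not prevent one core from accessing a shared resource while the other core is modifying it; they make sure that a lock taken or released with LDREX/STREX is observed in the right order.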
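A spinlock along these lines can be sketched with C11 atomics. When compiled for ARMv7-A, compilers generally lower these operations to LDREX/STREX loops with the appropriate barriers, but that depends on the toolchain; the type and function names here are illustrative, not taken from any particular RTOS.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;
} spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; acquire ordering keeps later accesses
 * to the shared resource from moving ahead of the lock. */
static bool spin_trylock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is taken. If the holder is halted by the
 * debugger, this loop never exits. */
static void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l)) {
        /* busy-wait */
    }
}

/* Release ordering makes prior writes visible before the unlock. */
static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

During debugging, replacing `spin_lock` with a bounded number of `spin_trylock` attempts makes it possible to detect the case where the other core was halted while holding the lock.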
Step 4: Implement Debug Exception Handling
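If the core lockup occurs during exception handling, it helps to find out which exception the core has taken. The DEMCR register and its VC_CORERESET bit are Cortex-M features; the ARMv7-A debug architecture provides vector catch through the Vector Catch Register (DBGVCR) instead. When a bit in DBGVCR is set and halting debug-mode is enabled, the core enters debug state as soon as it fetches the corresponding exception vector. Debuggers normally expose vector catch as a setting. The following snippet is a sketch of the equivalent target-side sequence, assuming privileged code running in Secure state and access to DBGVCR through CP14:
MRC p14, 0, R1, c0, c7, 0 ; Read DBGVCR
ORR R1, R1, #0x0000001A ; Catch Undefined (bit 1), Prefetch Abort (bit 3), Data Abort (bit 4)
MCR p14, 0, R1, c0, c7, 0 ; Write DBGVCR
ISB ; Make sure the new setting is in effect
With vector catch enabled, the core stops at the entry to the handler instead of disappearing into it, so the faulting address and the banked link register can be inspected before the handler runs.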
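On the Cortex-A9, the register that controls which exception vectors trap into the debugger is DBGVCR. Its value can be composed from named bits rather than magic numbers. The sketch below assumes the ARMv7-A DBGVCR layout with Security Extensions, where bits [7:0] apply to Secure state and the Non-secure copies sit 24 bits higher; the constant and function names are hypothetical.

```c
#include <stdint.h>

/* Secure-state vector catch bits of DBGVCR (ARMv7-A layout assumed). */
enum {
    VCR_RESET          = 1u << 0,
    VCR_S_UNDEF        = 1u << 1,
    VCR_S_SVC          = 1u << 2,
    VCR_S_PREFETCH_ABT = 1u << 3,
    VCR_S_DATA_ABT     = 1u << 4,
    VCR_S_IRQ          = 1u << 6,
    VCR_S_FIQ          = 1u << 7
};

/* Add the fault-type catches (Undefined, Prefetch Abort, Data Abort)
 * to an existing DBGVCR value. */
static uint32_t vcr_with_fault_catch(uint32_t current)
{
    return current | VCR_S_UNDEF | VCR_S_PREFETCH_ABT | VCR_S_DATA_ABT;
}

/* Translate Secure-state catch bits to their Non-secure counterparts
 * (bits [31:25]); there is no Non-secure reset catch, so bit 0 is dropped. */
static uint32_t vcr_to_nonsecure(uint32_t secure_bits)
{
    return (secure_bits & 0xDEu) << 24;
}
```

Catching only the fault vectors keeps normal IRQ and SVC traffic running, which matters in an RTOS where those exceptions fire constantly.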
Step 5: Use Hardware Breakpoints and Watchpoints
If the core lockup is caused by specific code paths or memory accesses, consider using hardware breakpoints or watchpoints to identify the problematic code. Hardware breakpoints are implemented by the Cortex-A9’s breakpoint registers, which allow the debugger to halt the core when the instruction at a specific address is executed. Similarly, watchpoints can be used to monitor memory accesses and halt the core when a specific memory location is read or written.
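A debugger normally programs the breakpoint registers itself when a hardware breakpoint is set from its user interface. The following sketch shows the equivalent target-side sequence for breakpoint 0, assuming privileged code, ARM (not Thumb) instructions, and access to the breakpoint registers through CP14. The breakpoint takes the core into debug state only if halting debug-mode is enabled:
LDR R1, =0x00001000 ; Address to break on (word-aligned)
MCR p14, 0, R1, c0, c0, 4 ; DBGBVR0: breakpoint value register
LDR R1, =0x000001E7 ; BAS=0b1111, PMC=0b11 (any mode), E=1 (enable)
MCR p14, 0, R1, c0, c0, 5 ; DBGBCR0: breakpoint control register
ISB ; Make sure the breakpoint is active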
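For an ARMv7-A core such as the Cortex-A9, a hardware breakpoint is described by a value register (DBGBVR) and a control register (DBGBCR). The helpers below sketch how those two values are derived for an unlinked address-match breakpoint, assuming the ARMv7-A DBGBCR layout (E in bit 0, PMC in bits [2:1], BAS in bits [8:5]). The function names are illustrative only.

```c
#include <stdint.h>

/* Compose DBGBCR for an enabled, unlinked address-match breakpoint.
 * bas: byte address select (4 bits), pmc: privileged mode control. */
static uint32_t dbgbcr_address_match(uint32_t bas, uint32_t pmc)
{
    return ((bas & 0xFu) << 5) | ((pmc & 0x3u) << 1) | 1u;
}

/* DBGBVR holds the word-aligned address; the low two bits are dropped. */
static uint32_t dbgbvr_value(uint32_t addr)
{
    return addr & ~0x3u;
}

/* BAS for a Thumb instruction: select the halfword the address points at. */
static uint32_t bas_for_thumb(uint32_t addr)
{
    return (addr & 0x2u) ? 0xCu : 0x3u;
}
```

With `bas = 0xF` and `pmc = 0x3`, `dbgbcr_address_match` yields 0x1E7, which matches an ARM-state instruction in any mode.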
Step 6: Analyze Power Management Settings
If the core lockup occurs during low-power states, analyze the power management settings of the Cortex-A9 processor. The Wait For Interrupt (WFI) and Wait For Event (WFE) instructions put the core into a standby state. Architecturally, a halt request normally counts as a wake-up event when invasive debug is enabled, but if the surrounding power management logic gates the core clock or removes power while the core is in standby, the debug logic cannot respond to the debugger.
To prevent this, keep the core clocked and powered while a debugger is attached. A simple approach is to replace WFI and WFE with NOP in debug builds, or to disable clock gating and power-down in the power management code. Where WFI is kept, the barrier belongs before it, not after:
DSB ; Complete outstanding memory accesses before entering standby
WFI ; Wait for interrupt
ISB ; Optional: resynchronize the instruction stream after wake-up
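One way to keep standby out of debug builds is to route every idle wait through a single function with a compile-time switch. The sketch below is an assumption about how a project might organize this: `DEBUG_NO_WFI` and `idle_wait` are made-up names, and the inline assembly is only compiled when building for 32-bit ARM with the switch turned off.

```c
#include <stdbool.h>

/* Hypothetical build switch: 1 keeps the core out of standby so that a
 * debugger can always halt it; set to 0 for production builds. */
#ifndef DEBUG_NO_WFI
#define DEBUG_NO_WFI 1
#endif

/* Idle hook for the RTOS or main loop.
 * Returns true if the core actually entered standby. */
static inline bool idle_wait(void)
{
#if defined(__arm__) && !DEBUG_NO_WFI
    /* Barrier first, then wait for interrupt. */
    __asm__ volatile("dsb\n\twfi" ::: "memory");
    return true;
#else
    /* Debugger-friendly path: return immediately without standby. */
    return false;
#endif
}
```

Calling `idle_wait()` from the idle task then behaves as a plain busy loop while debugging, at the cost of higher power consumption.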
Step 7: Update Debugger and Firmware
Finally, ensure that the debugger and firmware are up to date. Debugging tools like DS-5 are regularly updated to address issues and improve compatibility with different ARM processors. Similarly, firmware updates for the Cyclone V SoC may include fixes for known issues related to debug functionality.
By following these steps, developers can systematically address the Cortex-A9 core lockup issue and ensure that the debugger can halt and control both cores during debugging. This approach not only resolves the immediate issue but also provides a framework for diagnosing and addressing similar problems in the future.