GIC-500 Stuck State After Cortex-A53 Software Reset

The Generic Interrupt Controller (GIC-500) in the Armada 3720 SoC can enter a stuck state after a software-initiated reset of the Cortex-A53 cores. The Cortex-M3 secure coprocessor orchestrates the reset sequence, which resets the Cortex-A53 cores, the peripherals, and the GIC-500. Occasionally the GIC-500 fails to recover from this sequence: the RWP (Register Write Pending) bit in the GICD_CTLR register stays set indefinitely. Arm Trusted Firmware (TF-A) waits for RWP to clear while configuring the GIC during boot, so a bit that never clears leads to a system hang.

The Cortex-M3 has full access to all memory-mapped registers, including those of the GIC-500 and the Cortex-A53 cores. The reset sequence involves resetting the first Cortex-A53 core to its BootROM, placing the second Cortex-A53 core in reset, resetting all peripherals, and attempting to reset the GIC-500 via memory-mapped register writes. Despite this, the GIC-500 occasionally enters a state where the RWP bit remains high, indicating that pending register writes have not been propagated. This issue is intermittent, making it challenging to diagnose and resolve.

The GIC-500 is a critical component for interrupt management in the Armada 3720 SoC. While the RWP bit is set, software cannot assume that earlier configuration writes have taken effect, and firmware that waits for the bit to clear never gets past GIC initialization, which halts the boot process. Because the failure does not occur consistently, it points to a race condition, a timing issue, or a hardware bug in the GIC-500 reset mechanism.

Memory-Mapped Register Access and RWP Bit Propagation

The root cause of the GIC-500 stuck state most likely lies in the interaction between the Cortex-M3 firmware, the Cortex-A53 cores, and the GIC-500’s internal state machine. RWP is a read-only status bit (bit 31 of GICD_CTLR) that is set while the effects of certain register writes are still propagating through the GIC. In the GICv3 architecture it tracks changes to the enable and affinity-routing fields of GICD_CTLR and writes to the GICD_ICENABLER registers. Software is expected to poll until RWP reads zero before relying on those changes. If the reset sequence leaves the GIC-500 unable to complete a pending write, RWP never clears, and any code that waits on it is deadlocked.
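As a minimal sketch of what checking RWP looks like, the helper below reads GICD_CTLR and tests bit 31. The register offset and bit position are the architectural GICv3 values; `gicd_base` and the helper names are illustrative assumptions, and the distributor base address as seen from the Cortex-M3 has to come from the SoC documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural GICv3 values: GICD_CTLR is at distributor offset 0x0000
 * and RWP is bit 31. */
#define GICD_CTLR_OFFSET  0x0000u
#define GICD_CTLR_RWP     (1u << 31)

/* Read one 32-bit memory-mapped register. */
static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* Return true while the distributor reports a register write in progress.
 * gicd_base is an assumption: the distributor base in the caller's
 * address map. */
static bool gicd_write_pending(uintptr_t gicd_base)
{
    return (mmio_read32(gicd_base + GICD_CTLR_OFFSET) & GICD_CTLR_RWP) != 0u;
}
```

A single read like this is only a snapshot. Deciding that the bit is stuck requires repeated reads over a bounded interval.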

One possible cause is the timing of the reset sequence. The Cortex-M3 firmware resets the Cortex-A53 cores and peripherals before attempting to reset the GIC-500. If the GIC-500 is still processing interrupts or configuration changes when the reset is initiated, it may enter an undefined state. This is exacerbated by the fact that the Cortex-A53 cores may still be asserting interrupts or accessing the GIC-500 during the reset sequence.

Another potential cause is the lack of proper synchronization between the Cortex-M3 firmware and the GIC-500. The GIC-500 may require specific sequences of register writes to properly reset, and these sequences may not be fully documented or implemented in the Cortex-M3 firmware. Additionally, the GIC-500’s internal state machine may have undocumented dependencies or timing requirements that are not being met during the reset sequence.

The intermittent nature of the issue is consistent with both explanations: the outcome depends on the GIC-500’s internal state at the instant of the reset. If the GIC-500 is in the middle of processing an interrupt or configuration change when the reset is initiated, it may never clear the RWP bit. Whether that is a hardware bug in the GIC-500 or an undocumented requirement of its reset procedure cannot be determined from the symptom alone.

Implementing GIC-500 Reset Sequences and State Verification

To resolve the GIC-500 stuck state issue, the Cortex-M3 firmware must implement a robust reset sequence that ensures the GIC-500 is properly reset and its internal state is verified before proceeding with the boot process. This involves several steps:

First, the Cortex-M3 firmware must ensure that all Cortex-A53 cores are in a known state before attempting to reset the GIC-500. This includes placing both Cortex-A53 cores in reset and ensuring that they are not accessing the GIC-500 during the reset sequence. The Cortex-M3 firmware can achieve this by writing to the CPU_SOFTWARE_RESET register for the first Cortex-A53 core and configuring the RVBAR register for the second Cortex-A53 core.

Next, the Cortex-M3 firmware must reset all peripherals that may be accessing the GIC-500. This includes disabling interrupts and ensuring that no peripherals are asserting interrupts during the reset sequence. The Cortex-M3 firmware can achieve this by writing to the appropriate peripheral reset registers and disabling interrupts in the GIC-500.
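The GIC side of this step can be sketched as follows, assuming the Cortex-M3 can reach the distributor at some base address `gicd_base` (a placeholder, not a documented value). The offsets for GICD_TYPER, GICD_ICENABLER and GICD_ICPENDR are the architectural GICv3 ones; the function name is illustrative, and the peripheral reset registers are SoC-specific and deliberately left out.

```c
#include <stdint.h>

/* Architectural GICv3 distributor offsets. */
#define GICD_TYPER_OFFSET        0x0004u
#define GICD_ICENABLER_OFFSET    0x0180u
#define GICD_ICPENDR_OFFSET      0x0280u
#define GICD_TYPER_ITLINES_MASK  0x1Fu

static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void mmio_write32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Disable every SPI and clear its pending state.
 * Returns the number of 32-interrupt register banks that were written. */
static unsigned int gicd_quiesce_spis(uintptr_t gicd_base)
{
    /* ITLinesNumber = N means the distributor implements banks 0..N. */
    uint32_t last_bank = mmio_read32(gicd_base + GICD_TYPER_OFFSET)
                         & GICD_TYPER_ITLINES_MASK;
    unsigned int written = 0;

    /* Bank 0 covers SGIs and PPIs, so SPIs start at bank 1. */
    for (uint32_t bank = 1; bank <= last_bank; bank++) {
        /* Both registers are write-one-to-clear. */
        mmio_write32(gicd_base + GICD_ICENABLER_OFFSET + 4u * bank, 0xFFFFFFFFu);
        mmio_write32(gicd_base + GICD_ICPENDR_OFFSET + 4u * bank, 0xFFFFFFFFu);
        written++;
    }
    return written;
}
```

Writes to GICD_ICENABLER are among those tracked by RWP, so a routine like this should be followed by a bounded wait for RWP to clear.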

Once the Cortex-A53 cores and peripherals are in a known state, the Cortex-M3 firmware can attempt to reset the GIC-500. This involves writing to the GICD_CTLR register to disable the GIC-500 and then writing to it again to re-enable it. After each write, the firmware must poll the RWP bit in GICD_CTLR and wait for it to clear before issuing the next write or proceeding with the boot process. The poll should be bounded by a timeout so that a stuck RWP bit is detected rather than hanging the Cortex-M3 as well.
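A bounded version of this write-then-poll pattern might look like the sketch below. The enable bits (bits 0 to 2 in the Secure view of GICD_CTLR) and the RWP position are architectural; `gicd_base`, `max_polls` and the function names are assumptions chosen for illustration, and a real implementation would likely bound the wait with a hardware timer rather than a loop count.

```c
#include <stdbool.h>
#include <stdint.h>

#define GICD_CTLR_OFFSET       0x0000u
#define GICD_CTLR_RWP          (1u << 31)
/* EnableGrp0, EnableGrp1NS and EnableGrp1S as seen by a Secure access. */
#define GICD_CTLR_ENABLE_MASK  0x7u

static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void mmio_write32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Poll RWP at most max_polls times. Returns true once it reads as zero,
 * false if it is still set when the budget runs out. */
static bool gicd_wait_rwp_clear(uintptr_t gicd_base, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((mmio_read32(gicd_base + GICD_CTLR_OFFSET) & GICD_CTLR_RWP) == 0u)
            return true;
    }
    return false;
}

/* Clear the group enable bits, then wait for the write to take effect.
 * RWP is read-only, so writing it back unchanged has no effect. */
static bool gicd_disable(uintptr_t gicd_base, uint32_t max_polls)
{
    uint32_t ctlr = mmio_read32(gicd_base + GICD_CTLR_OFFSET);

    mmio_write32(gicd_base + GICD_CTLR_OFFSET, ctlr & ~GICD_CTLR_ENABLE_MASK);
    return gicd_wait_rwp_clear(gicd_base, max_polls);
}
```

A `false` return is the signal to enter the fallback path instead of continuing with the boot.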

If the RWP bit does not clear within the timeout, the Cortex-M3 firmware must fall back to a recovery sequence. This may involve writing to additional GIC-500 registers to force a reset, or performing a full SoC reset if the GIC-500 cannot be recovered. The firmware should also log the state of the GIC-500 registers to aid in debugging the issue.
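One way to structure the fallback, sketched under assumptions: retry the wait a fixed number of times, record a register snapshot on each attempt for later logging, and tell the caller when only a full SoC reset is left. The types and function names here are hypothetical, and the step that actually triggers the SoC reset is platform-specific, so it is left to the caller. The snapshot uses the architectural offsets of GICD_CTLR, GICD_TYPER and GICD_IIDR.

```c
#include <stdint.h>

#define GICD_CTLR_OFFSET   0x0000u
#define GICD_TYPER_OFFSET  0x0004u
#define GICD_IIDR_OFFSET   0x0008u
#define GICD_CTLR_RWP      (1u << 31)

/* Register values worth logging when the distributor looks stuck. */
typedef struct {
    uint32_t ctlr;
    uint32_t typer;
    uint32_t iidr;
} gicd_snapshot_t;

typedef enum {
    GIC_RECOVERED,
    GIC_NEEDS_SOC_RESET
} gic_recovery_t;

static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static gicd_snapshot_t gicd_snapshot(uintptr_t gicd_base)
{
    gicd_snapshot_t s;

    s.ctlr  = mmio_read32(gicd_base + GICD_CTLR_OFFSET);
    s.typer = mmio_read32(gicd_base + GICD_TYPER_OFFSET);
    s.iidr  = mmio_read32(gicd_base + GICD_IIDR_OFFSET);
    return s;
}

/* Give RWP several bounded chances to clear. *last_seen holds the
 * snapshot taken at the start of the most recent attempt. */
static gic_recovery_t gicd_try_recover(uintptr_t gicd_base,
                                       unsigned int attempts,
                                       uint32_t polls_per_attempt,
                                       gicd_snapshot_t *last_seen)
{
    for (unsigned int i = 0; i < attempts; i++) {
        *last_seen = gicd_snapshot(gicd_base);
        for (uint32_t p = 0; p < polls_per_attempt; p++) {
            if ((mmio_read32(gicd_base + GICD_CTLR_OFFSET) & GICD_CTLR_RWP) == 0u)
                return GIC_RECOVERED;
        }
    }
    return GIC_NEEDS_SOC_RESET;
}
```

On `GIC_NEEDS_SOC_RESET`, the caller would store or print the snapshot and then trigger whatever full-reset mechanism the platform provides.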

Finally, the Cortex-M3 firmware must verify that the GIC-500 is in a known state before proceeding with the boot process. This includes checking the GICD_CTLR register to ensure that the RWP bit is clear and that the GIC-500 is properly configured. The Cortex-M3 firmware should also verify that the Cortex-A53 cores and peripherals are in a known state before releasing them from reset.
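What counts as a "known state" is not spelled out above, so the following check is a sketch under one possible definition: RWP is clear, all group enable bits are off (leaving the enabling to the firmware that boots later), and the distributor still identifies itself as GICv3 through the ArchRev field of GICD_PIDR2. The offsets and field positions are architectural; the enum, the function name and the choice of checks are assumptions.

```c
#include <stdint.h>

#define GICD_CTLR_OFFSET          0x0000u
#define GICD_PIDR2_OFFSET         0xFFE8u
#define GICD_CTLR_RWP             (1u << 31)
#define GICD_CTLR_ENABLE_MASK     0x7u
#define GICD_PIDR2_ARCHREV_SHIFT  4u
#define GICD_PIDR2_ARCHREV_MASK   0xFu
#define GICD_ARCHREV_GICV3        0x3u

typedef enum {
    GICD_OK = 0,
    GICD_ERR_RWP_SET,        /* a register write is still pending */
    GICD_ERR_STILL_ENABLED,  /* a group enable bit is unexpectedly set */
    GICD_ERR_BAD_ARCHREV     /* ID register does not read back as GICv3 */
} gicd_check_t;

static inline uint32_t mmio_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* Check the distributor against the assumed post-reset state. */
static gicd_check_t gicd_verify_reset_state(uintptr_t gicd_base)
{
    uint32_t ctlr  = mmio_read32(gicd_base + GICD_CTLR_OFFSET);
    uint32_t pidr2 = mmio_read32(gicd_base + GICD_PIDR2_OFFSET);
    uint32_t archrev = (pidr2 >> GICD_PIDR2_ARCHREV_SHIFT)
                       & GICD_PIDR2_ARCHREV_MASK;

    if ((ctlr & GICD_CTLR_RWP) != 0u)
        return GICD_ERR_RWP_SET;
    if ((ctlr & GICD_CTLR_ENABLE_MASK) != 0u)
        return GICD_ERR_STILL_ENABLED;
    if (archrev != GICD_ARCHREV_GICV3)
        return GICD_ERR_BAD_ARCHREV;
    return GICD_OK;
}
```

Returning a distinct code per failed check makes the result useful in a debug log, since a stuck RWP bit and an unreadable ID register suggest different problems.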

By implementing these steps, the Cortex-M3 firmware resets the GIC-500 only after its clients are quiescent and verifies the result before the boot process continues. Since the underlying cause is not confirmed, this should make the stuck state less likely rather than impossible, and it ensures that a failure is detected and handled instead of hanging the system.

| Step | Action | Description |
|------|--------|-------------|
| 1 | Reset Cortex-A53 Cores | Place both Cortex-A53 cores in reset and ensure they are not accessing the GIC-500. |
| 2 | Reset Peripherals | Disable interrupts and reset all peripherals that may be accessing the GIC-500. |
| 3 | Reset GIC-500 | Write to the GICD_CTLR register to disable and re-enable the GIC-500. |
| 4 | Poll RWP Bit | Poll the RWP bit in the GICD_CTLR register to ensure it clears. |
| 5 | Fallback Sequence | Implement a fallback sequence to recover the GIC-500 if the RWP bit does not clear. |
| 6 | Verify GIC-500 State | Verify that the GIC-500 is in a known state before proceeding with the boot process. |

Followed in this order, the steps leave the GIC-500 in a verified state before boot continues, and they turn an indefinite hang into a condition the firmware can detect and recover from.
