ARM SBSA Watchdog Timer Pretimeout Feature Overview
The ARM Server Base System Architecture (SBSA) watchdog timer (WDT) is a critical component in ensuring system reliability by providing a mechanism to recover from system hangs or software failures. The SBSA watchdog timer operates in two modes: single-stage and double-stage. In single-stage mode, the watchdog timer triggers a system reset upon timeout. In double-stage mode, the watchdog timer first triggers a panic (a warning or pretimeout event) before the final reset. The pretimeout feature is designed to provide an early warning before the system reset, allowing the system to log diagnostic information or attempt recovery before the reset occurs.
The sbsa_gwdt driver in the Linux kernel (version 5.4.25) runs in single-stage mode by default, where the action module parameter is set to zero. In this configuration the driver does not provide the pretimeout feature, which is inherently tied to the double-stage mode. The pretimeout feature is particularly useful in systems where it is critical to capture diagnostic information or perform corrective actions before a full system reset.
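To make the difference between the two modes concrete, the following is a minimal userspace model of when each signal fires. It is a sketch, not driver code: it assumes both hardware stages have the same length, that in single-stage mode the configured timeout spans both stages, and that in double-stage mode each stage lasts one full timeout. The names gwdt_stage_times and gwdt_compute_stages are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the two operating modes; not taken from the driver. */
struct gwdt_stage_times {
    uint64_t ws0_ms;    /* time until the first signal (WS0, interrupt) */
    uint64_t ws1_ms;    /* time until the second signal (WS1, reset) */
    int ws0_panics;     /* non-zero if the first signal is acted upon */
};

/*
 * action == 0: single-stage mode. Assumed: the timeout covers both
 *              hardware stages and the first signal is ignored.
 * action != 0: double-stage mode. Assumed: the first signal fires after
 *              one timeout period, the reset after a second one.
 */
struct gwdt_stage_times gwdt_compute_stages(uint32_t timeout_s, int action)
{
    struct gwdt_stage_times t;
    uint64_t timeout_ms = (uint64_t)timeout_s * 1000;

    if (action) {
        t.ws0_ms = timeout_ms;
        t.ws1_ms = 2 * timeout_ms;
        t.ws0_panics = 1;
    } else {
        t.ws0_ms = timeout_ms / 2;
        t.ws1_ms = timeout_ms;
        t.ws0_panics = 0;
    }
    return t;
}
```

Under these assumptions, a 10-second timeout resets the system after 10 seconds in single-stage mode, while in double-stage mode the warning arrives at 10 seconds and the reset at 20.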
The pretimeout feature is typically implemented in hardware, where the watchdog timer generates an interrupt before the final reset. However, if the hardware does not support pretimeout, it can be emulated in software by configuring the watchdog timer to generate an interrupt at a predefined interval before the timeout. This requires careful coordination between the watchdog timer and the system software to ensure that the pretimeout event is handled correctly.
The sbsa_gwdt driver currently lacks support for the pretimeout feature, as evidenced by the absence of the WDIOF_PRETIMEOUT flag in the watchdog_info structure. This flag is required to enable pretimeout support in the Linux watchdog subsystem. The absence of this flag indicates that the driver does not currently provide a mechanism to handle pretimeout events, either in hardware or software.
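Whether a given driver advertises the flag can be checked from userspace through the standard watchdog ioctl interface. The sketch below assumes the Linux UAPI header linux/watchdog.h is available; the helper names info_has_pretimeout and fd_has_pretimeout are illustrative and not part of any library.

```c
#include <linux/watchdog.h>   /* struct watchdog_info, WDIOF_*, WDIOC_GETSUPPORT */
#include <sys/ioctl.h>

/* Returns 1 if the reported options advertise pretimeout support, else 0. */
int info_has_pretimeout(const struct watchdog_info *info)
{
    return (info->options & WDIOF_PRETIMEOUT) != 0;
}

/*
 * Queries an already-open watchdog descriptor.
 * Returns 1 or 0 as above, or -1 if the ioctl fails.
 * Opening a watchdog device normally arms the timer, so the caller
 * is responsible for the open, keepalive, and close sequence.
 */
int fd_has_pretimeout(int fd)
{
    struct watchdog_info info;

    if (ioctl(fd, WDIOC_GETSUPPORT, &info) < 0)
        return -1;
    return info_has_pretimeout(&info);
}
```

The descriptor is taken as a parameter rather than opened inside the helper because opening the device has side effects on a running system.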
Hardware Limitations and Software Emulation of Pretimeout
The primary challenge in implementing the pretimeout feature in the sbsa_gwdt driver is determining whether the underlying hardware supports pretimeout functionality. The ARM SBSA watchdog timer specification does not explicitly mandate support for pretimeout, which means that some implementations may not include this feature. If the hardware does not support pretimeout, it must be emulated in software.
In hardware that supports pretimeout, the watchdog timer typically has two stages: the first stage generates a pretimeout interrupt, and the second stage triggers the system reset. The pretimeout interrupt can be used to log diagnostic information, attempt recovery, or notify the user of an impending reset. If the hardware does not support pretimeout, the software must configure the watchdog timer to generate an interrupt at a predefined interval before the timeout. This requires careful calculation of the pretimeout interval and coordination with the watchdog timer’s timeout period.
The sbsa_gwdt driver currently does not provide a mechanism to handle pretimeout events, either in hardware or software. To add support for pretimeout, the driver must be modified to include the WDIOF_PRETIMEOUT flag in the watchdog_info structure. This flag informs the Linux watchdog subsystem that the driver supports pretimeout functionality. Additionally, the driver must implement the necessary logic to handle pretimeout events, either by configuring the hardware to generate a pretimeout interrupt or by emulating the pretimeout feature in software.
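The event-handling logic can be illustrated with a small userspace mock of the first-stage interrupt path. Everything prefixed with mock_ is hypothetical and only stands in for kernel facilities; in the kernel itself, a driver would typically forward the event with watchdog_notify_pretimeout() and let the configured pretimeout governor decide what to do.

```c
#include <stdio.h>

/* Userspace mock; none of these types are the kernel's own. */
enum mock_irq_result { MOCK_IRQ_NONE, MOCK_IRQ_HANDLED };

struct mock_wdd {
    unsigned int timeout;       /* seconds until reset */
    unsigned int pretimeout;    /* warning lead time in seconds; 0 disables it */
    void (*governor)(struct mock_wdd *wdd);  /* stand-in for a pretimeout governor */
    unsigned int notifications; /* number of warnings delivered so far */
};

/* Example governor: only logs the impending reset. */
void mock_log_governor(struct mock_wdd *wdd)
{
    fprintf(stderr, "watchdog: pretimeout, reset in about %u s\n",
            wdd->pretimeout);
}

/* Stand-in for the kernel's pretimeout notification hook. */
void mock_notify_pretimeout(struct mock_wdd *wdd)
{
    wdd->notifications++;
    if (wdd->governor)
        wdd->governor(wdd);
}

/*
 * First-stage interrupt handler: forward the event as a pretimeout
 * notification when pretimeout is enabled, otherwise report it unhandled.
 */
enum mock_irq_result mock_ws0_interrupt(struct mock_wdd *wdd)
{
    if (wdd->pretimeout == 0)
        return MOCK_IRQ_NONE;
    mock_notify_pretimeout(wdd);
    return MOCK_IRQ_HANDLED;
}
```

Keeping the policy (log, panic, attempt recovery) in a separate callback mirrors the split between driver and governor, so the interrupt handler itself stays short.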
If the hardware does not support pretimeout, the software must emulate this feature by configuring the watchdog timer to generate an interrupt at a predefined interval before the timeout. This requires calculating the pretimeout interval based on the watchdog timer’s timeout period and configuring the timer accordingly. The software must also implement a mechanism to handle the pretimeout interrupt, such as logging diagnostic information or attempting recovery before the system reset.
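A software emulation along these lines can be modelled as a small state machine driven by a periodic poll or a software timer. The sketch below is a userspace illustration under the assumption that a millisecond clock is available; the emu_ names are invented for this example and do not correspond to any kernel API.

```c
#include <stdint.h>

enum emu_event { EMU_NONE, EMU_PRETIMEOUT, EMU_TIMEOUT };

struct emu_wdt {
    uint64_t last_ping_ms;   /* time of the most recent keepalive */
    uint32_t timeout_ms;     /* full watchdog period */
    uint32_t pretimeout_ms;  /* warning lead time; must be < timeout_ms */
    int warned;              /* warning already reported for this period */
};

/* Keepalive: restart the period and re-arm the warning. */
void emu_ping(struct emu_wdt *w, uint64_t now_ms)
{
    w->last_ping_ms = now_ms;
    w->warned = 0;
}

/*
 * Reports what should happen at time now_ms. The warning is raised once,
 * pretimeout_ms before the period expires; the timeout is reported on
 * every poll after expiry.
 */
enum emu_event emu_poll(struct emu_wdt *w, uint64_t now_ms)
{
    uint64_t elapsed = now_ms - w->last_ping_ms;

    if (elapsed >= w->timeout_ms)
        return EMU_TIMEOUT;
    if (!w->warned && elapsed >= (uint64_t)(w->timeout_ms - w->pretimeout_ms)) {
        w->warned = 1;
        return EMU_PRETIMEOUT;
    }
    return EMU_NONE;
}
```

With a 10-second timeout and a 2-second pretimeout, the warning is reported 8 seconds after the last keepalive and the timeout at 10 seconds. A pretimeout of zero never produces a warning, because the timeout check runs first.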
Implementing Pretimeout Support in the sbsa_gwdt Driver
To implement pretimeout support in the sbsa_gwdt driver, the following steps must be taken:
- Modify the watchdog_info structure: The watchdog_info structure must be updated to include the WDIOF_PRETIMEOUT flag. This flag informs the Linux watchdog subsystem that the driver supports pretimeout functionality. The modified structure should look like this:

      static const struct watchdog_info sbsa_gwdt_info = {
          .identity = WATCHDOG_NAME,
          .options = WDIOF_SETTIMEOUT |
                     WDIOF_PRETIMEOUT |
                     WDIOF_KEEPALIVEPING |
                     WDIOF_MAGICCLOSE,
      };

- Add Pretimeout Support to the Driver: The driver must be modified to support pretimeout functionality. If the hardware supports pretimeout, the driver should configure the watchdog timer to generate a pretimeout interrupt. If the hardware does not support pretimeout, the driver should emulate this feature in software by configuring the watchdog timer to generate an interrupt at a predefined interval before the timeout.
- Handle Pretimeout Interrupts: The driver must implement a mechanism to handle pretimeout interrupts. This includes logging diagnostic information, attempting recovery, or notifying the user of an impending reset. The pretimeout interrupt handler should be registered with the Linux kernel’s interrupt handling subsystem.
- Calculate Pretimeout Interval: If the hardware does not support pretimeout, the driver must calculate the pretimeout interval based on the watchdog timer’s timeout period. The pretimeout interval should be a predefined fraction of the timeout period, such as 10% or 20%. The driver should configure the watchdog timer to generate an interrupt at this interval before the timeout.
- Update the Watchdog Timer Configuration: The driver must update the watchdog timer’s configuration to support pretimeout functionality. This includes setting the pretimeout interval, enabling the pretimeout interrupt, and configuring the watchdog timer to generate a reset upon timeout.
- Test and Validate: The modified driver should be tested and validated to ensure that the pretimeout feature works as expected. This includes testing both hardware-supported and software-emulated pretimeout functionality.
By following these steps, the sbsa_gwdt driver can be updated to support the pretimeout feature, providing an early warning before a system reset and improving system reliability and diagnostics.
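The interval calculation and timer configuration steps can be sketched as plain arithmetic. The helpers below are hypothetical and make two assumptions that are not stated in the text: the pretimeout must be strictly shorter than the timeout, and the warning interrupt is programmed through a 32-bit compare value expressed in timer ticks.

```c
#include <stdint.h>

/*
 * Pretimeout as a percentage of the timeout (for example 10 or 20).
 * Returns 0, meaning "disabled", for percentages outside 1..99.
 * Very short timeouts can also round down to 0.
 */
uint32_t pretimeout_from_percent(uint32_t timeout_s, uint32_t percent)
{
    if (percent == 0 || percent >= 100)
        return 0;
    return (uint32_t)((uint64_t)timeout_s * percent / 100);
}

/* Assumed rule: a pretimeout is valid only if shorter than the timeout. */
int pretimeout_is_valid(uint32_t timeout_s, uint32_t pretimeout_s)
{
    return pretimeout_s < timeout_s;
}

/*
 * Ticks from the last keepalive until the warning interrupt should fire.
 * Returns 0 on success and stores the value in *ticks_out; returns -1 if
 * pretimeout is disabled or invalid, or if the value does not fit the
 * assumed 32-bit register.
 */
int warning_ticks(uint32_t clk_hz, uint32_t timeout_s,
                  uint32_t pretimeout_s, uint32_t *ticks_out)
{
    uint64_t ticks;

    if (pretimeout_s == 0 || !pretimeout_is_valid(timeout_s, pretimeout_s))
        return -1;
    ticks = (uint64_t)clk_hz * (timeout_s - pretimeout_s);
    if (ticks > UINT32_MAX)
        return -1;
    *ticks_out = (uint32_t)ticks;
    return 0;
}
```

For a 30-second timeout with a 20% pretimeout on a 100 MHz timer, the warning fires 6 seconds before the reset, which corresponds to 2,400,000,000 ticks after the last keepalive. The overflow check matters in practice: at the same clock rate, a 60-second timeout no longer fits a 32-bit value.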
Conclusion
The ARM SBSA watchdog timer is a critical component in ensuring system reliability, and the pretimeout feature provides an early warning before a system reset. The current implementation of the sbsa_gwdt driver in the Linux kernel does not support the pretimeout feature, but this can be addressed by modifying the driver to include the WDIOF_PRETIMEOUT flag and implementing the necessary logic to handle pretimeout events. Whether the hardware supports pretimeout or it must be emulated in software, the driver must be carefully updated to ensure that the pretimeout feature works as expected. By following the steps outlined above, the sbsa_gwdt driver can be enhanced to support the pretimeout feature, improving system reliability and diagnostics.