ARM Non-Reordering Device Memory and IMPLEMENTATION DEFINED SIZE
The ARM architecture, particularly in the context of ARMv8, provides a robust framework for memory management, including the handling of device memory. One of the key attributes of device memory is the non-Reordering (nR) attribute, which ensures that memory accesses to a single peripheral occur in the same order as they appear in the program. However, the concept of IMPLEMENTATION DEFINED SIZE introduces a layer of complexity that can lead to confusion, especially when it comes to understanding whether this size is determined by hardware or software, and how it impacts memory access ordering.
Non-Reordering Attribute and Memory Access Ordering
The non-Reordering attribute is a critical feature in ARM architectures, particularly when dealing with memory-mapped peripherals. This attribute ensures that memory accesses to a single peripheral are not reordered, meaning they arrive at the peripheral in the same sequence as they are issued by the program. This is crucial for peripherals that rely on the order of operations, such as those involved in firmware loading or reset sequences.
The ARMv8 Architecture Reference Manual states:
"For all memory types with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral of IMPLEMENTATION DEFINED size, as defined by the peripheral, must be the same order that occurs in a simple sequential execution of the program."
This statement highlights two key points:
- The non-Reordering attribute applies to memory accesses targeting a single peripheral.
- The size of the memory region associated with the peripheral is IMPLEMENTATION DEFINED.
IMPLEMENTATION DEFINED SIZE: Hardware or Software?
The term "IMPLEMENTATION DEFINED" in the ARM architecture refers to aspects of the implementation that are not strictly defined by the architecture itself but are left to the discretion of the hardware designer. This includes parameters such as the size of memory regions, the behavior of certain instructions, and the configuration of peripherals.
In the context of non-Reordering device memory, the IMPLEMENTATION DEFINED SIZE refers to the size of the memory region that a peripheral occupies. This size is determined by the hardware designer when the peripheral is implemented. It is not something that can be configured or altered by software, such as through page table settings.
The confusion often arises from the interpretation of whether IMPLEMENTATION DEFINED SIZE refers to the size of the memory region (MMIO size) or the size of individual data accesses (e.g., 8-bit, 16-bit, 32-bit accesses). The ARMv8 manual uses the term in both contexts, but in the case of non-Reordering device memory, it primarily refers to the size of the memory region associated with the peripheral.
Memory Access Size and Peripheral Constraints
While the IMPLEMENTATION DEFINED SIZE in the context of non-Reordering device memory refers to the memory region size, it is also important to consider the size of individual data accesses. Peripherals often have specific requirements regarding the size of data accesses they can handle. For example, a peripheral might only support 32-bit accesses, and attempting to perform an 8-bit or 16-bit access could lead to undefined behavior or require the access to be coalesced into a 32-bit access.
The GICv2 (Generic Interrupt Controller) Architecture Specification provides an example:
"All registers support 32-bit word accesses with the access type defined in Table 4-1 on page 4-75 and Table 4-2 on page 4-76. In addition, the GICD_IPRIORITYRn, GICD_ITARGETSRn, GICD_CPENDSGIRn, and GICD_SPENDSGIRn registers support byte accesses. Whether any halfword register accesses are permitted is IMPLEMENTATION DEFINED."
This example illustrates that the size of data accesses is also subject to implementation-defined constraints, which are separate from the IMPLEMENTATION DEFINED SIZE of the memory region.
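A minimal sketch of how software might keep access widths explicit is shown below. The helper names (mmio_write32, mmio_read32, mmio_write8) are hypothetical and not taken from any ARM document; each one asks the compiler for a single volatile access of the named width, so a driver can restrict itself to the widths its peripheral documents as supported.

```c
#include <stdint.h>

/* Hypothetical helpers: each requests one volatile access of the named width. */
static inline void mmio_write32(volatile void *reg, uint32_t value)
{
    *(volatile uint32_t *)reg = value;
}

static inline uint32_t mmio_read32(const volatile void *reg)
{
    return *(const volatile uint32_t *)reg;
}

/* Use only for registers documented as byte-accessible,
 * such as GICD_IPRIORITYRn in the quote above. */
static inline void mmio_write8(volatile void *reg, uint8_t value)
{
    *(volatile uint8_t *)reg = value;
}
```

Routing every register access through helpers like these makes an unsupported width (for example, a halfword access to a GIC register) visible in code review rather than being left to whatever the compiler chooses to emit.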
Implications for Software Developers
Understanding the distinction between the IMPLEMENTATION DEFINED SIZE of the memory region and the size of data accesses is crucial for software developers working with ARM-based systems. Misinterpreting these concepts can lead to subtle bugs, especially in scenarios where the order of memory accesses is critical.
For example, consider a scenario where software needs to write to two memory-mapped peripherals that are contiguous in memory. Suppose peripheral A occupies addresses 0x0000-0x0FFF, and peripheral B occupies addresses 0x1000-0x1FFF. Both memory regions are marked as Device-nGnRE (non-Gathering, non-Reordering, Early Write Acknowledgement).
volatile uint32_t *peripheral_a = (volatile uint32_t *)0x0FFC; /* last word of peripheral A */
volatile uint32_t *peripheral_b = (volatile uint32_t *)0x1000; /* first word of peripheral B */
*peripheral_a = 1; /* Assume the memory type is Device-nGnRE */
*peripheral_b = 1; /* Assume the memory type is Device-nGnRE */
In this case, the non-Reordering attribute does not guarantee that the write to peripheral A will occur before the write to peripheral B. The non-Reordering attribute only ensures that accesses to the same peripheral occur in program order. Since peripheral A and peripheral B are separate entities, the order of accesses to them is not guaranteed by the non-Reordering attribute.
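One way to picture this rule is as a lookup over a peripheral map. The sketch below is a model for reasoning, not something the hardware exposes to software: the periph_window structure, the periph_map table, and both function names are hypothetical, and the two windows simply mirror the example's peripheral A and peripheral B.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical peripheral map mirroring the example above. */
struct periph_window {
    uintptr_t base;
    size_t    size;   /* IMPLEMENTATION DEFINED: fixed by the hardware */
};

static const struct periph_window periph_map[] = {
    { 0x0000u, 0x1000u },  /* peripheral A */
    { 0x1000u, 0x1000u },  /* peripheral B */
};

/* Return the index of the window containing addr, or -1 if none does. */
static int periph_index(uintptr_t addr)
{
    for (size_t i = 0; i < sizeof periph_map / sizeof periph_map[0]; i++) {
        /* Unsigned subtraction wraps for addr < base, so the test fails. */
        if (addr - periph_map[i].base < periph_map[i].size)
            return (int)i;
    }
    return -1;
}

/* The nR attribute orders two accesses only when both hit one peripheral. */
static bool nr_orders_accesses(uintptr_t first, uintptr_t second)
{
    int idx = periph_index(first);
    return idx >= 0 && idx == periph_index(second);
}
```

Under this model, nr_orders_accesses(0x0FF8, 0x0FFC) is true because both addresses fall inside peripheral A, while nr_orders_accesses(0x0FFC, 0x1000) is false because the two writes target different peripherals.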
This has important implications for scenarios where the order of operations is critical, such as when peripheral A stores firmware for peripheral B. If peripheral B is released from reset before its firmware is fully loaded, it could lead to undefined behavior.
extern const uint8_t fw_src_addr[]; /* firmware image, assumed to be defined elsewhere */
void *fw_dest_addr = (void *)0x0000; /* peripheral A (illustrative address) */
volatile uint32_t *reset_addr = (volatile uint32_t *)0x1000; /* peripheral B */
/* Copy peripheral B's firmware to peripheral A's MMIO region */
memcpy(fw_dest_addr, fw_src_addr, 0x1000);
/* Release peripheral B from reset */
*reset_addr = 1;
In this example, the non-Reordering attribute does not guarantee that the memcpy operation will complete before the write to reset_addr. To ensure the correct order of operations, software must place an explicit synchronization mechanism, such as a data memory barrier (DMB) or data synchronization barrier (DSB), between the copy and the reset write.
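A hedged sketch of the corrected sequence follows. It replaces memcpy with an explicit loop of 32-bit volatile stores, because the access widths memcpy uses are not specified and may not match what the peripheral supports. The names mmio_barrier and load_firmware_and_release are hypothetical. The sketch also assumes a DSB is sufficient on the target system; with Early Write Acknowledgement, some platforms may additionally need a read-back from the peripheral to confirm that the writes have arrived.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical barrier wrapper: DSB on AArch64, a full fence elsewhere so
 * that the sketch still builds on a development host. */
static inline void mmio_barrier(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb sy" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Copy the firmware with 32-bit stores, then release the consumer from reset. */
static void load_firmware_and_release(volatile uint32_t *fw_dest,
                                      const uint32_t *fw_src,
                                      size_t words,
                                      volatile uint32_t *reset_reg)
{
    for (size_t i = 0; i < words; i++)
        fw_dest[i] = fw_src[i];  /* same peripheral: nR keeps these in order */

    mmio_barrier();              /* order the copy before the reset write */
    *reset_reg = 1;              /* different peripheral: needs the barrier */
}
```

On the example system, the call would pass peripheral A's base as fw_dest, a word count of 0x1000 / 4, and peripheral B's reset register as reset_reg.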
Clarifying the Original Quote
The original quote from the ARMv8 Architecture Reference Manual has been a source of confusion, particularly regarding whether "IMPLEMENTATION DEFINED SIZE" refers to the size of the memory region or the size of data accesses. The manual states:
"For all memory types with the non-Reordering attribute, the order of memory accesses arriving at a single peripheral of IMPLEMENTATION DEFINED size, as defined by the peripheral, must be the same order that occurs in a simple sequential execution of the program."
The key to understanding this sentence lies in its grammatical structure. The phrase "of IMPLEMENTATION DEFINED size" modifies "a single peripheral," not "memory accesses," and the clause "as defined by the peripheral" says that the peripheral itself determines that size. This interpretation is supported by the context provided in the manual, which discusses the order of memory accesses to a single peripheral.
To further clarify, consider the following rephrasing of the sentence:
"For all memory types with the non-Reordering attribute, memory accesses that arrive at a single peripheral, whose size is IMPLEMENTATION DEFINED by that peripheral, must arrive in the same order that occurs in a simple sequential execution of the program."
This rephrasing makes it clear that the "IMPLEMENTATION DEFINED size" refers to the size of the peripheral’s memory region, not the size of individual memory accesses.
Practical Considerations for System Design
When designing systems that rely on non-Reordering device memory, it is essential to consider both the IMPLEMENTATION DEFINED SIZE of the memory region and the size of data accesses. Here are some practical considerations:
- Peripheral Memory Region Size: The size of the memory region allocated to a peripheral is determined by the hardware designer and is fixed at the time of implementation. Software cannot alter this size, and it must be respected when accessing the peripheral.
- Data Access Size: Peripherals may have specific requirements regarding the size of data accesses. For example, a peripheral might only support 32-bit accesses, and attempting to perform an 8-bit or 16-bit access could lead to undefined behavior. Software must ensure that data accesses conform to the peripheral’s requirements.
- Ordering Guarantees: The non-Reordering attribute only guarantees the order of accesses to the same peripheral. Accesses to different peripherals are not guaranteed to occur in program order, even if both peripherals are marked as non-Reordering. Software must use explicit synchronization mechanisms to enforce ordering between accesses to different peripherals.
- Synchronization Mechanisms: To ensure the correct order of operations, software should use a data memory barrier (DMB) or data synchronization barrier (DSB) when necessary. A DMB orders the memory accesses before it against those after it; a DSB additionally stalls execution until the earlier accesses have completed.
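The first two considerations can be partly checked at compile time. The sketch below is an illustration under assumed values: PERIPH_B_WINDOW_SIZE, the periph_b_regs layout, and its register names are hypothetical and would in practice come from the SoC documentation. It describes a peripheral's window as a structure of 32-bit registers and uses C11 static assertions to confirm that the structure matches the hardware-defined window.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window size; the real value comes from the SoC manual. */
#define PERIPH_B_WINDOW_SIZE 0x1000u

/* Hypothetical register layout: every register is a 32-bit word. */
struct periph_b_regs {
    volatile uint32_t reset;     /* offset 0x000 */
    volatile uint32_t status;    /* offset 0x004 */
    uint32_t reserved[(PERIPH_B_WINDOW_SIZE - 8u) / 4u];
};

/* Fail the build if the layout drifts from the hardware-defined window. */
_Static_assert(sizeof(struct periph_b_regs) == PERIPH_B_WINDOW_SIZE,
               "register layout must cover exactly one peripheral window");
_Static_assert(offsetof(struct periph_b_regs, status) == 0x004,
               "status register must sit at offset 0x004");
```

Because every named field is a volatile uint32_t, code that accesses the peripheral through this structure naturally uses 32-bit accesses, which fits a peripheral that supports only word accesses.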
Conclusion
The concept of IMPLEMENTATION DEFINED SIZE in ARM non-Reordering device memory is a nuanced aspect of the ARM architecture that requires careful consideration. It is essential to distinguish between the size of the memory region allocated to a peripheral and the size of individual data accesses. Both are determined by the hardware designer and cannot be configured by software.
Understanding these concepts is crucial for software developers working with ARM-based systems, particularly in scenarios where the order of memory accesses is critical. By respecting the constraints imposed by the hardware and using appropriate synchronization mechanisms, developers can ensure reliable and predictable system behavior.
In summary:
- IMPLEMENTATION DEFINED SIZE refers to the size of the memory region allocated to a peripheral and is determined by the hardware designer.
- The non-Reordering attribute ensures that memory accesses to the same peripheral occur in program order, but does not guarantee ordering between accesses to different peripherals.
- Software must use explicit synchronization mechanisms, such as memory barriers, to enforce ordering between accesses to different peripherals.
By adhering to these principles, developers can effectively manage memory access ordering in ARM-based systems and avoid subtle bugs that can arise from misinterpretation of the architecture’s specifications.