ARM Cortex-M Exclusive Access Branch Out of Range Error

The issue concerns an assembly snippet on page 310 of the CoreSight Components Technical Reference Manual (ARM DDI 0314H). The code demonstrates an exclusive write to an ITM stimulus port on an ARM Cortex-M processor. When the code is assembled, the assembler reports a ‘branch out of range’ error on the CZBEQ and CZBNE instructions. These are compare-and-branch instructions, and they appear to be older mnemonics for what current assembler syntax calls CBZ and CBNZ: each tests a register against zero and branches on the result. Here they check the status of a FIFO (First In First Out) buffer and retry the operation if the FIFO is not ready or if the exclusive store fails.

The CZBEQ and CZBNE instructions have a specific limitation: the branch destination must be within 4 to 130 bytes after the instruction and in the same execution state. In other words, they can only branch forward. In the provided code snippet, the branch destination is before the instruction, which violates this constraint. The limit comes from the instruction encoding, which has room only for a small forward offset, so the assembler cannot encode the branch and reports an error.

The code snippet in question is as follows:

; R0 = FIFO-full/exclusive status
; R1 = base of ITM stimulus ports
; R2 = value to write
retry
LDREX R0,[R1,#??] ; read FIFO status and request excl lock
CZBEQ R0,retry ; FIFO not ready, try again
STREX R0,R2,[R1,#??] ; store if FIFO !Full and excl lock
CZBNE R0,retry ; excl lock failed, try again

The LDREX (Load Exclusive) and STREX (Store Exclusive) instructions are part of the ARM architecture’s exclusive access mechanism, which is used to implement atomic operations in a multi-core or multi-threaded environment. LDREX loads a value from memory and marks the address for exclusive access in the exclusive monitor. STREX stores a value to that address only if the monitor still records the exclusive access, and it writes a status to its destination register: 0 if the store succeeded, 1 if it did not. This is why the snippet tests R0 for non-zero after the STREX.

The CZBEQ and CZBNE instructions are used to check the status of the FIFO buffer and to retry the operation if necessary. The CZBEQ instruction branches to the retry label if the FIFO is full (i.e., if the value in register R0 is zero), while the CZBNE instruction branches to the retry label if the exclusive lock fails (i.e., if the value in register R0 is non-zero after the STREX instruction).
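The control flow of the two retries can be illustrated with a small host-side model. This is only a sketch under stated assumptions: `ToyStimulusPort`, `busy_reads`, `failing_stores`, and `exclusive_write` are hypothetical names invented for the illustration, and the model reduces the exclusive monitor to a single flag rather than reproducing real hardware behavior.

```python
class ToyStimulusPort:
    """Toy model of one stimulus port guarded by an exclusive monitor.

    All names here are hypothetical; this is not real hardware behavior.
    """

    def __init__(self, busy_reads=0, failing_stores=0):
        self.busy_reads = busy_reads          # reads that report "FIFO full" (0)
        self.failing_stores = failing_stores  # STREX attempts that lose the monitor
        self.exclusive = False                # monitor state
        self.written = []                     # values accepted by the port

    def ldrex(self):
        """Return the FIFO status (1 = ready, 0 = full) and set the monitor."""
        self.exclusive = True
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return 0
        return 1

    def strex(self, value):
        """Store if the monitor is still set. Return 0 on success, 1 on failure."""
        if self.failing_stores > 0:
            self.failing_stores -= 1
            self.exclusive = False  # model an intervening access clearing the monitor
        if not self.exclusive:
            return 1
        self.exclusive = False
        self.written.append(value)
        return 0


def exclusive_write(port, value):
    """Mirror the retry loop and return the number of LDREX attempts."""
    attempts = 0
    while True:
        attempts += 1
        if port.ldrex() == 0:       # FIFO full: first branch back to retry
            continue
        if port.strex(value) != 0:  # exclusive store failed: second branch back
            continue
        return attempts
```

With two busy reads and one failed store, the loop needs four LDREX attempts before the value is accepted, which shows that both branches lead back to the same label at the top of the sequence.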

The issue arises because the retry label is located before the CZBEQ and CZBNE instructions. A backward branch needs a negative offset, and these instructions cannot encode one, so the destination is out of range no matter how close the label is.
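The range rule can be checked with a few lines of arithmetic. The sketch below assumes the 4 to 130 byte rule quoted earlier and a hypothetical layout in which the retry label sits at address 0, LDREX and STREX each occupy 4 bytes, and each compare-and-branch occupies 2 bytes. The function name `cbz_target_in_range` is made up for the illustration.

```python
def cbz_target_in_range(instr_addr, target_addr):
    """Return True if a compare-and-branch at instr_addr can reach target_addr.

    In Thumb state the PC reads as the instruction address plus 4, and the
    encodable offset from that PC is an even value from 0 to 126 bytes.
    """
    pc = instr_addr + 4
    offset = target_addr - pc
    return 0 <= offset <= 126 and offset % 2 == 0


# Hypothetical layout of the snippet (addresses are assumptions):
RETRY = 0    # retry label, LDREX starts here (4 bytes)
CZBEQ = 4    # first compare-and-branch (2 bytes)
STREX = 6    # STREX (4 bytes)
CZBNE = 10   # second compare-and-branch (2 bytes)

for name, addr in (("CZBEQ", CZBEQ), ("CZBNE", CZBNE)):
    print(name, "can reach retry:", cbz_target_in_range(addr, RETRY))
```

Both checks report that the label cannot be reached, because the required offsets are negative.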

Branch Destination Constraints and Instruction Encoding

Cortex-M processors execute only the Thumb instruction set, in which instructions are either 16 or 32 bits wide. Whatever the width, a branch offset must fit in the bits left over after the opcode, condition, and register fields, so every branch instruction has a limited range.

For the CZBEQ and CZBNE instructions, the branch offset is encoded as an unsigned immediate value that is added to the program counter (PC). Because the immediate is unsigned, these instructions can only branch forward, and the destination must be 4 to 130 bytes after the instruction.

The exact figures follow from the encoding. Assuming CZBEQ and CZBNE correspond to CBZ and CBNZ, they are 16-bit Thumb instructions with a 6-bit unsigned offset field counted in halfwords, which gives an encoded offset of 0 to 126 bytes. The offset is added to the PC, which in Thumb state reads as the address of the instruction plus 4, so the reachable targets lie 4 to 130 bytes after the instruction. Other Thumb branches, such as the conditional B instructions, use signed offsets and can therefore branch backward as well as forward.
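To make the field layout concrete, here is a rough sketch of how such an instruction word could be assembled. It assumes that CZBEQ and CZBNE use the 16-bit CBZ/CBNZ Thumb encoding (bits 1011, op, 0, i, 1, imm5, Rn, with the offset formed from i:imm5 followed by a zero bit). The function name `encode_cbz` and its parameters are invented for this example.

```python
def encode_cbz(rn, instr_addr, target_addr, nonzero=False):
    """Encode a compare-and-branch as a 16-bit Thumb halfword.

    rn      -- register number, low registers R0-R7 only
    nonzero -- False for branch-if-zero, True for branch-if-non-zero
    Raises ValueError when the target cannot be encoded.
    """
    if not 0 <= rn <= 7:
        raise ValueError("only R0-R7 can be encoded")
    offset = target_addr - (instr_addr + 4)  # PC reads as instruction + 4
    if offset < 0 or offset > 126 or offset % 2:
        raise ValueError("branch out of range")
    imm6 = offset >> 1        # offset in halfwords, 6 unsigned bits
    i = imm6 >> 5             # top bit goes to bit 9
    imm5 = imm6 & 0x1F        # low five bits go to bits 7..3
    op = 1 if nonzero else 0  # bit 11 selects the non-zero variant
    return 0xB100 | (op << 11) | (i << 9) | (imm5 << 3) | rn


print(hex(encode_cbz(0, 0, 6)))    # forward branch, offset of 2 bytes from PC
try:
    encode_cbz(0, 4, 0)            # backward branch to a label at address 0
except ValueError as err:
    print("cannot encode:", err)
```

There is no sign bit anywhere in the word, so a backward target has no valid encoding and the only possible outcome is an error.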

In the provided code snippet, the retry label is located before the CZBEQ and CZBNE instructions, which means that the branch destination is out of range for these instructions. This is because the branch offset would need to be a negative value, which is not supported by the CZBEQ and CZBNE instructions.

Reorganizing Code to Adhere to Branch Constraints

To resolve the ‘branch out of range’ error, the backward branches must use an instruction that can encode a negative offset. Moving the retry label to a location 4 to 130 bytes after the CZBEQ and CZBNE instructions would satisfy the assembler, but a retry loop has to jump back to the LDREX, so the label cannot simply be relocated.

One possible solution is to use a different branch instruction, such as the B (Branch) instruction, which accepts both forward and backward targets and has a much larger range. In Thumb-2, the 32-bit unconditional encoding reaches about ±16MB, and the 16-bit unconditional encoding about ±2KB. However, an unconditional B cannot replace the CZBEQ and CZBNE instructions directly, because those instructions branch only when the register test succeeds.

Instead, the conditional forms of the B instruction can be used together with a compare. The CMP (Compare) instruction compares the value in register R0 with zero and sets the condition flags, and a BEQ or BNE instruction then branches to the retry label depending on the result.

Here is an example of how the code can be reorganized to use the B instruction:

; R0 = FIFO-full/exclusive status
; R1 = base of ITM stimulus ports
; R2 = value to write
retry
LDREX R0,[R1,#??] ; read FIFO status and request excl lock
CMP R0, #0 ; compare FIFO status with zero
BEQ retry ; branch to retry if FIFO is full
STREX R0,R2,[R1,#??] ; store if FIFO !Full and excl lock
CMP R0, #0 ; compare STREX result with zero
BNE retry ; branch to retry if STREX failed

In this revised code, the CMP instruction compares the value in register R0 with zero, and the BEQ (Branch if Equal) and BNE (Branch if Not Equal) instructions branch to the retry label according to the resulting flags. BEQ and BNE use signed offsets, so they can branch backward, and their range is much larger than that of CZBEQ and CZBNE: -256 to +254 bytes for the 16-bit encoding and about ±1MB for the 32-bit encoding. Unlike CZBEQ and CZBNE, the CMP instruction updates the condition flags, which matters only if surrounding code relies on the flags being preserved.
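The ranges of the Thumb B encodings can be derived from the width of their signed offset fields. The sketch below assumes the field widths of the ARMv7-M encodings (8 and 11 bits for the 16-bit forms, 20 and 24 bits for the 32-bit forms), each counted in halfwords and taken relative to the PC. The names `signed_branch_range` and `THUMB_BRANCH_BITS` are made up for the illustration.

```python
def signed_branch_range(imm_bits):
    """Byte range of a signed branch offset stored as imm_bits halfwords.

    A signed field of n bits holds -2**(n-1) .. 2**(n-1) - 1 halfwords,
    which is -2**n .. 2**n - 2 bytes once scaled by 2.
    """
    lowest = -(1 << imm_bits)
    highest = (1 << imm_bits) - 2
    return lowest, highest


# Width of the signed halfword offset in each encoding (assumed, see lead-in)
THUMB_BRANCH_BITS = {
    "B<cond>   (16-bit)": 8,
    "B         (16-bit)": 11,
    "B<cond>.W (32-bit)": 20,
    "B.W       (32-bit)": 24,
}

for name, bits in THUMB_BRANCH_BITS.items():
    low, high = signed_branch_range(bits)
    print(f"{name}: {low:+d} .. {high:+d} bytes from PC")
```

For the short retry loop discussed here, even the 16-bit conditional encoding is more than sufficient, since the label is only a few bytes behind each branch.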

This solution resolves the ‘branch out of range’ error by ensuring that the branch destination is within the allowed range for the branch instructions. However, it is important to note that this solution may introduce additional overhead due to the use of the CMP instruction, which is not present in the original code. This overhead is generally negligible, but it should be taken into consideration when optimizing performance-critical code.

Implementing Data Synchronization Barriers and Cache Management

In addition to resolving the ‘branch out of range’ error, it is also important to consider the implications of the exclusive access mechanism on data synchronization and cache management. The LDREX and STREX instructions are used to implement atomic operations, but they do not provide any guarantees about the ordering of memory accesses or the consistency of the cache.

To ensure that memory accesses are properly ordered, it may be necessary to use barrier instructions and, on cores that have caches, cache maintenance operations. The DSB (Data Synchronization Barrier) instruction completes only when all explicit memory accesses before it have completed, and no instruction after it executes until then. Whether a barrier is needed here depends on what ordering the surrounding code relies on; the LDREX/STREX pair does not itself require one.

Here is an example of how the code can be modified to include a DSB instruction:

; R0 = FIFO-full/exclusive status
; R1 = base of ITM stimulus ports
; R2 = value to write
retry
LDREX R0,[R1,#??] ; read FIFO status and request excl lock
CMP R0, #0 ; compare FIFO status with zero
BEQ retry ; branch to retry if FIFO is full
DSB ; data synchronization barrier
STREX R0,R2,[R1,#??] ; store if FIFO !Full and excl lock
CMP R0, #0 ; compare STREX result with zero
BNE retry ; branch to retry if STREX failed

In this revised code, the DSB instruction is inserted before the STREX instruction so that all earlier memory accesses complete before the STREX executes. The barrier only affects ordering: whether the store succeeded is still reported solely by the STREX status in R0.

In addition to the DSB instruction, two other barrier instructions may be relevant. The DMB (Data Memory Barrier) instruction ensures that explicit memory accesses before the barrier are observed before those after it, without waiting for them to complete. The ISB (Instruction Synchronization Barrier) instruction flushes the pipeline so that the instructions after it are fetched only once the barrier has completed. Neither is a cache maintenance instruction; both are barriers and do not clean or invalidate caches.

Here is an example of how the code can be modified to include these barrier instructions:

; R0 = FIFO-full/exclusive status
; R1 = base of ITM stimulus ports
; R2 = value to write
retry
LDREX R0,[R1,#??] ; read FIFO status and request excl lock
CMP R0, #0 ; compare FIFO status with zero
BEQ retry ; branch to retry if FIFO is full
DMB ; data memory barrier
DSB ; data synchronization barrier
STREX R0,R2,[R1,#??] ; store if FIFO !Full and excl lock
CMP R0, #0 ; compare STREX result with zero
BNE retry ; branch to retry if STREX failed
ISB ; instruction synchronization barrier

In this revised code, the DMB and DSB instructions are placed before the STREX instruction. Because DSB provides every ordering guarantee of DMB and additionally waits for completion, the DMB is redundant when it is immediately followed by a DSB and can be omitted. The ISB instruction is placed after the final BNE, so it executes once the loop has exited and ensures that subsequent instructions are fetched after the preceding ones have completed.

These barriers constrain the order in which memory accesses and instructions take effect around the exclusive write; they do not clean or invalidate caches. They also introduce additional overhead, so each one should be included only where the surrounding code needs the ordering it provides, particularly in performance-critical code.

Conclusion

The ‘branch out of range’ error in the provided assembly code snippet is caused by the branch destination being located before the CZBEQ and CZBNE instructions, which can only branch forward by 4 to 130 bytes. The issue can be resolved by replacing each compare-and-branch with a CMP followed by a BEQ or BNE, both of which can branch backward. Additionally, it is worth considering how the exclusive access sequence interacts with memory ordering, and using barrier instructions where the surrounding code requires them.

By carefully analyzing the constraints of the ARM architecture and the specific requirements of the code, it is possible to resolve the ‘branch out of range’ error and ensure that the code operates correctly and efficiently. This approach can be applied to other similar issues in ARM assembly programming, and it highlights the importance of understanding the underlying architecture and instruction set when working with low-level code.
