ARM AArch64 AES Instructions and Incorrect Decryption Output
The issue concerns an incorrect implementation of AES-128 in ECB (Electronic Codebook) mode using the ARM AArch64 cryptographic instructions aese (AES single round encryption), aesd (AES single round decryption), aesmc (AES mix columns), and aesimc (AES inverse mix columns). The user reports that after encrypting a 128-bit plaintext and then decrypting it, the decrypted output does not match the original plaintext. This discrepancy points to a misapplication of the AES instructions, particularly in the handling of key registers and the order of operations during encryption and decryption.
The user’s implementation loads a 128-bit plaintext into a SIMD register (v0), applies the aese instruction with a key stored in another SIMD register (v1), and then attempts to reverse the process using aesd with the same key. However, the decrypted output does not match the original plaintext, indicating a flaw in the approach. The user also observes that the key register (v1) must contain identical 8-bit values across all its lanes for the operation to yield correct results, which hints at a potential misconfiguration in the key setup or usage.
This issue is critical because AES is a widely used symmetric encryption algorithm, and its correct implementation is essential for secure communication and data protection. Misuse of the ARM cryptographic instructions can lead to vulnerabilities or functional failures in cryptographic applications. The problem is exacerbated by the subtlety of the ARM AArch64 instruction set, where small deviations in register usage or instruction sequencing can lead to significant deviations in output.
Key Register Misconfiguration and Instruction Sequencing Errors
The primary cause of the incorrect decryption output lies in the misconfiguration of the key register and improper sequencing of the AES instructions. The ARM AArch64 cryptographic extensions provide highly optimized instructions for AES operations, but their correct usage requires careful attention to detail, particularly in how keys are loaded and how the instructions are chained together.
Key Register Misconfiguration
The user notes that each 8-bit lane in the key register (v1) must contain the same value for the AES operations to function correctly. This observation reflects a misunderstanding of how the AES instructions interpret the key register. The aese and aesd instructions treat their second operand as an ordinary 128-bit round key, derived from the original AES key through key expansion, and XOR it into the state before the byte substitution and row shift. There is no requirement for identical bytes across the lanes. A uniform value is special only in that ShiftRows and InvShiftRows leave it unchanged, so a test that uses one can hide ordering mistakes that a real round key would expose.
In a real-world scenario, the key register should contain the round key for the current AES round, which differs from round to round and is derived from the original key using the AES key schedule. The user’s use of movi v1.16b, #0x09 to fill the key register with a constant is a reasonable debugging technique, but it does not reflect the key setup required for AES encryption and decryption, and it adds to the confusion about why the decrypted output differs from the original plaintext.
Instruction Sequencing Errors
The user’s implementation also sequences the AES instructions incorrectly. A full AES round consists of SubBytes, ShiftRows, MixColumns, and AddRoundKey, but aese and aesd each cover only part of a round. They must be combined with aesmc or aesimc, the appropriate round key, and the correct ordering to achieve the desired result.
In the user’s code, the sequence aese v0.16b, v1.16b followed by aesd v0.16b, v1.16b cannot restore the plaintext. The aese instruction XORs the key register into the state (AddRoundKey) and then applies SubBytes and ShiftRows; it does not perform MixColumns, which is the job of aesmc. The aesd instruction XORs the key register into the state and then applies InvShiftRows and InvSubBytes; Inverse MixColumns is the job of aesimc. Because aesd XORs the key before the inverse transformations rather than after them, it is not the inverse of aese with the same key operand. Undoing a single aese requires the inverse transformations first (aesd with an all-zero key) and then an eor with the original key.
Additionally, the user’s implementation lacks the necessary key expansion and round key management required for AES. The AES algorithm uses a key schedule to generate round keys from the original key, and each round of encryption or decryption uses a different round key. The user’s approach of using the same key for both encryption and decryption without key expansion is a fundamental flaw that leads to incorrect results.
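The single-round behavior can be checked without hardware using a small software model. The sketch below is an illustration, not Arm's reference code: it assumes aese computes ShiftRows(SubBytes(state XOR key)) and aesd computes InvSubBytes(InvShiftRows(state XOR key)), and the names model_aese, model_aesd, naive_round_trip_ok, and corrected_round_trip_ok are hypothetical. The S-box is computed rather than tabulated to keep the example short.

```c
#include <stdint.h>
#include <string.h>

static uint8_t sbox[256], inv_sbox[256];

/* Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B. */
static uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return p;
}

static uint8_t rotl8(uint8_t x, int s) {
    return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Build the S-box and its inverse: field inverse, then the affine map. */
static void init_sbox(void) {
    for (int a = 0; a < 256; a++) {
        uint8_t inv = 0;
        for (int b = 1; b < 256 && a != 0; b++) {
            if (gmul((uint8_t)a, (uint8_t)b) == 1) { inv = (uint8_t)b; break; }
        }
        uint8_t s = (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                              rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
        sbox[a] = s;
        inv_sbox[s] = (uint8_t)a;
    }
}

/* Model of aese: AddRoundKey, then SubBytes and ShiftRows.
   Byte 4*c + r of the block is row r, column c of the AES state. */
static void model_aese(uint8_t st[16], const uint8_t key[16]) {
    uint8_t t[16];
    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            int src = 4 * ((c + r) % 4) + r;   /* row r rotates left by r */
            t[4 * c + r] = sbox[st[src] ^ key[src]];
        }
    }
    memcpy(st, t, 16);
}

/* Model of aesd: AddRoundKey, then InvShiftRows and InvSubBytes. */
static void model_aesd(uint8_t st[16], const uint8_t key[16]) {
    uint8_t t[16];
    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            int src = 4 * c + r;
            int dst = 4 * ((c + r) % 4) + r;   /* row r rotates right by r */
            t[dst] = inv_sbox[st[src] ^ key[src]];
        }
    }
    memcpy(st, t, 16);
}

/* Returns 1 if aese followed by aesd with the same key restores the block. */
int naive_round_trip_ok(const uint8_t pt[16], const uint8_t key[16]) {
    uint8_t st[16];
    init_sbox();
    memcpy(st, pt, 16);
    model_aese(st, key);
    model_aesd(st, key);
    return memcmp(st, pt, 16) == 0;
}

/* Returns 1 if aesd with a zero key, then XOR with the key, restores it. */
int corrected_round_trip_ok(const uint8_t pt[16], const uint8_t key[16]) {
    static const uint8_t zero[16] = {0};
    uint8_t st[16];
    init_sbox();
    memcpy(st, pt, 16);
    model_aese(st, key);
    model_aesd(st, zero);
    for (int i = 0; i < 16; i++) st[i] ^= key[i];
    return memcmp(st, pt, 16) == 0;
}
```

With the plaintext bytes 00 11 22 ... ff and a key of sixteen 0x09 bytes (the value used in the user’s movi), the first function reports a mismatch and the second reports a match under this model.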
Correct Key Expansion, Instruction Sequencing, and Debugging Techniques
To resolve the issue and achieve correct AES-128 ECB encryption and decryption using ARM AArch64 instructions, the following steps must be taken:
Key Expansion and Round Key Management
The first step in implementing AES-128 ECB is to perform key expansion to generate the round keys from the original AES key. The key expansion process involves applying the AES key schedule algorithm to derive the round keys used in each round of encryption and decryption. The ARM AArch64 instruction set does not provide direct support for key expansion, so this must be implemented in software.
AES-128 uses 11 round keys of 128 bits each: the original key, which serves as the first round key, plus one derived key for each of the 10 rounds. The key schedule derives them using the RotWord and SubWord operations and the round constants Rcon. The following C routine outlines the key expansion process for AES-128:
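#include <stdint.h>

#define Nk 4   // Key length in 32-bit words (AES-128)
#define Nb 4   // Block size in 32-bit words
#define Nr 10  // Number of rounds (AES-128)

// Standard AES S-box from FIPS 197, assumed to be defined elsewhere
extern const uint8_t SBox[256];

// Round constants; index 0 is unused
static const uint8_t Rcon[11] = {0x00, 0x01, 0x02, 0x04, 0x08, 0x10,
                                 0x20, 0x40, 0x80, 0x1b, 0x36};

// RoundKey must have room for 4 * Nb * (Nr + 1) = 176 bytes
void KeyExpansion(uint8_t *RoundKey, const uint8_t *Key) {
    uint8_t temp[4];
    int i;
    // The first round key is the original key
    for (i = 0; i < 4 * Nk; i++) {
        RoundKey[i] = Key[i];
    }
    // Each later word is the word Nk positions back XOR the (transformed) previous word
    for (i = 4 * Nk; i < 4 * Nb * (Nr + 1); i += 4) {
        temp[0] = RoundKey[i - 4];
        temp[1] = RoundKey[i - 3];
        temp[2] = RoundKey[i - 2];
        temp[3] = RoundKey[i - 1];
        if (i % (4 * Nk) == 0) {
            // RotWord: rotate the word left by one byte
            uint8_t t = temp[0];
            temp[0] = temp[1];
            temp[1] = temp[2];
            temp[2] = temp[3];
            temp[3] = t;
            // SubWord: substitute each byte through the S-box
            temp[0] = SBox[temp[0]];
            temp[1] = SBox[temp[1]];
            temp[2] = SBox[temp[2]];
            temp[3] = SBox[temp[3]];
            // Rcon: XOR the round constant into the first byte
            temp[0] ^= Rcon[i / (4 * Nk)];
        }
        RoundKey[i] = RoundKey[i - 4 * Nk] ^ temp[0];
        RoundKey[i + 1] = RoundKey[i + 1 - 4 * Nk] ^ temp[1];
        RoundKey[i + 2] = RoundKey[i + 2 - 4 * Nk] ^ temp[2];
        RoundKey[i + 3] = RoundKey[i + 3 - 4 * Nk] ^ temp[3];
    }
}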
Once the round keys are generated, they must be stored in a way that allows easy access during the encryption and decryption processes. In the ARM AArch64 implementation, the round keys can be stored in SIMD registers or memory, depending on the specific requirements of the application.
Correct Instruction Sequencing for AES Encryption and Decryption
The correct sequencing of AES instructions is crucial for achieving the desired encryption and decryption results. The following steps outline the correct sequence of operations for AES-128 ECB encryption and decryption using ARM AArch64 instructions:
AES-128 ECB Encryption
- Load the plaintext into a SIMD register (e.g., v0).
- Load the first round key into a SIMD register (e.g., v1). No separate eor is needed for the initial AddRoundKey, because aese XORs its key operand into the state before substituting bytes.
- For each of the first 9 rounds:
  - Apply the aese instruction, which performs AddRoundKey with the current round key, then SubBytes and ShiftRows.
  - Apply the aesmc instruction to perform the MixColumns transformation.
  - Load the next round key into a SIMD register.
- For the final round, apply the aese instruction with the tenth round key (without aesmc), then load the eleventh round key and perform the last AddRoundKey operation using the eor instruction.
- After the final round, store the ciphertext from the SIMD register to memory.
AES-128 ECB Decryption
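- Load the ciphertext into a SIMD register (e.g., v0).
- Load the first decryption round key, which is the last encryption round key, into a SIMD register (e.g., v1).
- For each of the first 9 rounds:
  - Apply the aesd instruction, which performs AddRoundKey with the current round key, then InvShiftRows and InvSubBytes.
  - Apply the aesimc instruction to perform the Inverse MixColumns transformation.
  - Load the next decryption round key into a SIMD register. Because aesimc runs before this key is XORed in, the nine middle round keys must themselves be passed through aesimc when the decryption key schedule is prepared.
- For the final round, apply the aesd instruction with the tenth decryption round key (without aesimc), then load the original key and perform the last AddRoundKey operation using the eor instruction.
- After the final round, store the plaintext from the SIMD register to memory.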
The following code snippet demonstrates the sequencing of AES-128 ECB encryption and decryption for a single block using ARM AArch64 instructions. It assumes that x0 points to the plaintext block, x1 to the 11 encryption round keys, x2 to the ciphertext buffer, and x4 to the 11 decryption round keys:
// AES-128 ECB Encryption
// x1 -> rk0..rk10 (11 round keys from the key expansion, 16 bytes each)
ld1 {v0.16b}, [x0]              // Load plaintext into v0
ld1 {v1.16b}, [x1], #16         // Load rk0 and advance the key pointer
mov w3, #9                      // Nine rounds include MixColumns
1:
aese v0.16b, v1.16b             // AddRoundKey, SubBytes, ShiftRows
aesmc v0.16b, v0.16b            // MixColumns
ld1 {v1.16b}, [x1], #16         // Load next round key
subs w3, w3, #1                 // Decrement round counter
b.ne 1b                         // Repeat for the first nine rounds
aese v0.16b, v1.16b             // Final round with rk9 (no MixColumns)
ld1 {v1.16b}, [x1]              // Load rk10
eor v0.16b, v0.16b, v1.16b      // Final AddRoundKey
st1 {v0.16b}, [x2]              // Store ciphertext
// AES-128 ECB Decryption
// x4 -> dk0..dk10, where dk0 = rk10, dk10 = rk0, and
// dk1..dk9 are rk9..rk1 each passed through aesimc
ld1 {v0.16b}, [x2]              // Load ciphertext into v0
ld1 {v1.16b}, [x4], #16         // Load dk0 and advance the key pointer
mov w3, #9                      // Nine rounds include Inverse MixColumns
1:
aesd v0.16b, v1.16b             // AddRoundKey, InvShiftRows, InvSubBytes
aesimc v0.16b, v0.16b           // Inverse MixColumns
ld1 {v1.16b}, [x4], #16         // Load next decryption round key
subs w3, w3, #1                 // Decrement round counter
b.ne 1b                         // Repeat for the first nine rounds
aesd v0.16b, v1.16b             // Final round with dk9 (no Inverse MixColumns)
ld1 {v1.16b}, [x4]              // Load dk10 (the original key)
eor v0.16b, v0.16b, v1.16b      // Final AddRoundKey
st1 {v0.16b}, [x0]              // Store plaintext
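Decryption with aesd and aesimc is usually paired with a reordered key schedule, often called the equivalent inverse cipher schedule. The sketch below shows one way to derive it in portable C from the 11 encryption round keys. It assumes the keys are stored back to back in a 176-byte array, and the names model_aesimc and make_decrypt_keys are hypothetical. On hardware, the aesimc instruction itself would normally replace the software Inverse MixColumns.

```c
#include <stdint.h>
#include <string.h>

/* Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B. */
static uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return p;
}

/* Software model of aesimc: Inverse MixColumns on each 4-byte column. */
void model_aesimc(uint8_t out[16], const uint8_t in[16]) {
    for (int c = 0; c < 4; c++) {
        const uint8_t *a = in + 4 * c;
        uint8_t b[4];
        b[0] = (uint8_t)(gmul(a[0], 14) ^ gmul(a[1], 11) ^ gmul(a[2], 13) ^ gmul(a[3], 9));
        b[1] = (uint8_t)(gmul(a[0], 9) ^ gmul(a[1], 14) ^ gmul(a[2], 11) ^ gmul(a[3], 13));
        b[2] = (uint8_t)(gmul(a[0], 13) ^ gmul(a[1], 9) ^ gmul(a[2], 14) ^ gmul(a[3], 11));
        b[3] = (uint8_t)(gmul(a[0], 11) ^ gmul(a[1], 13) ^ gmul(a[2], 9) ^ gmul(a[3], 14));
        memcpy(out + 4 * c, b, 4);
    }
}

/* Build decryption keys dk0..dk10 from encryption keys rk0..rk10:
   dk0 = rk10, dk10 = rk0, and dk_i = InvMixColumns(rk_(10-i)) otherwise. */
void make_decrypt_keys(uint8_t dk[176], const uint8_t rk[176]) {
    memcpy(dk, rk + 160, 16);
    for (int i = 1; i < 10; i++) {
        model_aesimc(dk + 16 * i, rk + 16 * (10 - i));
    }
    memcpy(dk + 160, rk, 16);
}
```

The first and last keys are copied unchanged because the first aesd and the final eor have no Inverse MixColumns step between the key XOR and the state they act on.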
Debugging Techniques and Best Practices
To ensure the correct implementation of AES-128 ECB using ARM AArch64 instructions, the following debugging techniques and best practices should be employed:
- Verify Key Expansion: Ensure that the key expansion process generates the correct round keys. Compare the generated round keys with known test vectors to confirm their correctness.
- Check Instruction Sequencing: Verify that the sequence of AES instructions matches the expected sequence for AES encryption and decryption. Pay particular attention to the use of the aese, aesd, aesmc, and aesimc instructions.
- Validate Intermediate Results: After each round of AES encryption or decryption, validate the intermediate results against known test vectors. This helps identify any deviations from the expected behavior early in the process.
- Use Debugging Tools: Utilize ARM debugging tools such as ARM DS-5 or GDB with ARM support to step through the code and inspect the contents of SIMD registers at each stage of the AES process.
- Test with Known Vectors: Test the implementation with known plaintext, key, and ciphertext vectors to ensure that the encryption and decryption processes produce the correct results.
- Optimize for Performance: Once the implementation is verified, consider optimizing the code for performance by minimizing memory accesses, leveraging SIMD parallelism, and reducing instruction count where possible.
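A host-side reference can make the known-vector check concrete. The following sketch models the aese and aesmc sequence in portable C so that a hardware result can be compared against it. It is an assumption-laden illustration rather than production code: the function names are hypothetical, the S-box is computed instead of tabulated, and the table lookups are not constant-time.

```c
#include <stdint.h>
#include <string.h>

static uint8_t sbox[256];

/* Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B. */
static uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return p;
}

static uint8_t rotl8(uint8_t x, int s) {
    return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Build the S-box: field inverse followed by the affine map. */
static void init_sbox(void) {
    for (int a = 0; a < 256; a++) {
        uint8_t inv = 0;
        for (int b = 1; b < 256 && a != 0; b++) {
            if (gmul((uint8_t)a, (uint8_t)b) == 1) { inv = (uint8_t)b; break; }
        }
        sbox[a] = (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                            rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
    }
}

/* AES-128 key schedule: 11 round keys, 176 bytes in total. */
static void key_expansion(uint8_t rk[176], const uint8_t key[16]) {
    uint8_t rcon = 1;
    memcpy(rk, key, 16);
    for (int i = 16; i < 176; i += 4) {
        uint8_t t[4];
        memcpy(t, rk + i - 4, 4);
        if (i % 16 == 0) {
            uint8_t t0 = t[0];               /* RotWord, SubWord, Rcon */
            t[0] = (uint8_t)(sbox[t[1]] ^ rcon);
            t[1] = sbox[t[2]];
            t[2] = sbox[t[3]];
            t[3] = sbox[t0];
            rcon = gmul(rcon, 2);
        }
        for (int j = 0; j < 4; j++) rk[i + j] = (uint8_t)(rk[i + j - 16] ^ t[j]);
    }
}

/* Model of aese: AddRoundKey, then SubBytes and ShiftRows. */
static void model_aese(uint8_t st[16], const uint8_t key[16]) {
    uint8_t t[16];
    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++) {
            int src = 4 * ((c + r) % 4) + r;
            t[4 * c + r] = sbox[st[src] ^ key[src]];
        }
    }
    memcpy(st, t, 16);
}

/* Model of aesmc: MixColumns on each 4-byte column. */
static void model_aesmc(uint8_t st[16]) {
    for (int c = 0; c < 4; c++) {
        uint8_t *a = st + 4 * c, b[4];
        for (int r = 0; r < 4; r++) {
            b[r] = (uint8_t)(gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3) ^
                             a[(r + 2) % 4] ^ a[(r + 3) % 4]);
        }
        memcpy(a, b, 4);
    }
}

/* Encrypt one block with the same sequencing as the instruction version. */
void aes128_encrypt_block(uint8_t out[16], const uint8_t in[16],
                          const uint8_t key[16]) {
    uint8_t rk[176], st[16];
    init_sbox();
    key_expansion(rk, key);
    memcpy(st, in, 16);
    for (int r = 0; r < 9; r++) {            /* nine rounds with MixColumns */
        model_aese(st, rk + 16 * r);
        model_aesmc(st);
    }
    model_aese(st, rk + 144);                /* final round, no MixColumns */
    for (int i = 0; i < 16; i++) out[i] = (uint8_t)(st[i] ^ rk[160 + i]);
}
```

A convenient check is the AES-128 example in FIPS 197 Appendix C.1: key 00 01 ... 0f, plaintext 00 11 22 ... ff, ciphertext 69 c4 e0 d8 6a 7b 04 30 d8 cd b7 80 70 b4 c5 5a. If the model and the hardware path disagree on this block, comparing their state after each round usually narrows the fault to a specific step.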
By following these steps and best practices, the correct implementation of AES-128 ECB using ARM AArch64 cryptographic instructions can be achieved, ensuring both functional correctness and optimal performance.