Executive Summary (TL;DR)

The `keccak` crate for Rust contained an unsoundness bug in its optional ARMv8 assembly backend. The assembly uses post-indexed instructions that modify registers x0, x1, and x8, but the `asm!` block declared those registers as read-only inputs (`in`). Breaking that contract with the compiler is Undefined Behavior: the optimizer may generate broken code that corrupts memory or miscalculates cryptographic state.

GHSA-3288-P39F-RQPV

Rust Keccak: When 'Immutable' Inputs Go Rogue on ARMv8

Alon Barad

Software Engineer

Feb 19, 2026

No Known Exploit

Executive Summary (TL;DR)

A deep-dive analysis of a technical unsoundness in the Rust `keccak` crate's ARMv8 assembly backend. By misrepresenting register constraints to the LLVM compiler, the implementation created a divergence between the hardware state and the compiler's abstract model, leading to Undefined Behavior (UB) and potential memory corruption scenarios.

The Hook: Speed Kills (Safety)

Cryptography is an eternal war between mathematical correctness and raw execution speed. In the Rust ecosystem, we pride ourselves on memory safety, ownership rules, and the borrow checker's iron fist. But when you need to hash gigabytes of data per second, safe Rust sometimes isn't enough. You reach for the `unsafe` keyword, and occasionally, you drop directly into inline assembly (`asm!`).

This is exactly what the keccak crate (part of the RustCrypto organization) did. To squeeze every ounce of performance out of ARMv8 processors (like the one in your shiny MacBook or heavy-duty AWS Graviton instance), they implemented the Keccak-f[1600] permutation using hand-tuned assembly. It's a standard move: bypass the compiler's heuristics to utilize specific hardware instructions.

However, writing inline assembly in a high-level language is a handshake agreement with the compiler. You promise to tell the compiler exactly which registers you touch, read, or clobber. If you lie—even accidentally—the compiler's optimizer, which assumes you are a rational actor, will punish you. In this case, the keccak developers made a classic mistake: they modified 'input' registers behind the compiler's back.

The Flaw: Lying to LLVM

To understand this bug, you need to understand how Rust and LLVM (Rust's code generation backend) handle inline assembly constraints. Every operand of an `asm!` block is classified. `in("reg")` tells the compiler: "I am reading from this register, and I promise it will hold the same value when the block ends." The compiler is free to build on that promise, for example by reusing the register afterwards without reloading it.

The vulnerability lies in src/armv8.rs. The assembly code utilized ARM64's post-indexed addressing mode. Look at this instruction pattern used in the vulnerable code:

st1.1d {v20-v23}, [x0], #32

In plain English, this instruction says: "Store the vector registers v20 through v23 into the memory address at x0, and then increment x0 by 32 bytes." That [x0], #32 syntax is the smoking gun. It is an auto-increment.

The hardware executes this, and x0 changes: it now holds a new memory address. But the Rust code wrapping the assembly declared x0 as an `in` operand, so the compiler assumes x0 is unchanged when the block ends. This creates a split reality: the hardware holds x0 + 32, while the compiler's model still holds the original x0.
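The write-back is easier to see in a toy model. The sketch below is not the crate's code. It imitates a post-indexed store in plain Rust, treating a slice as memory and a `usize` as the x0 register. The function name `st1_post_index` and the byte-addressed memory model are assumptions made for illustration.

```rust
/// Toy model of AArch64 `st1.1d {v20-v23}, [x0], #32`:
/// store four 64-bit lanes at the byte address held in `x0`, then add `imm` to `x0`.
/// `mem` stands in for memory addressed in bytes from 0 (hypothetical model).
fn st1_post_index(mem: &mut [u64], x0: &mut usize, lanes: [u64; 4], imm: usize) {
    let word = *x0 / 8; // byte address -> index of the 64-bit word
    mem[word..word + 4].copy_from_slice(&lanes);
    *x0 += imm; // the register write-back that an `in` operand does not declare
}
```

Calling it with `x0 = 0` and `imm = 32` stores the four lanes in words 0 through 3 and leaves `x0` at 32, which is the value the hardware holds while the compiler still assumes 0.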

The Code: The Auto-Increment Trap

Let's look at the diff. It is a masterclass in how a few characters can mean the difference between 'secure' and 'undefined behavior'. The issue wasn't just x0 (state pointer); it also affected x1 (constants pointer) and x8 (loop counter).

The Vulnerable Code:

unsafe {
    asm!(
        // ... instructions omitted ...
        // Literal braces are doubled because asm! templates use format-string syntax.
        "st1.1d {{v20-v23}}, [x0], #32", // <--- HARDWARE MODIFIES x0
        "st1.1d {{v24}},     [x0]",

        // The Lie:
        in("x0") state.as_mut_ptr(),
        in("x1") crate::RC[24-round_count..].as_ptr(),
        in("x8") round_count,
        // ...
    );
}

Because of in("x0"), if the compiler needs state.as_mut_ptr() again after this assembly block, it is entitled to skip recomputing the pointer and reuse x0, assuming the register still holds the start of the buffer. It has no idea the assembly code advanced the pointer.

The Fixed Code:

unsafe {
    asm!(
        // ... same instructions ...
        // Literal braces are doubled because asm! templates use format-string syntax.
        "st1.1d {{v20-v23}}, [x0], #32",
        "st1.1d {{v24}},     [x0]",

        // The Truth:
        inout("x0") state.as_mut_ptr() => _,
        inout("x1") crate::RC[24-round_count..].as_ptr() => _,
        inout("x8") round_count => _,
        // ...
    );
}

The fix changes in to inout. Crucially, it adds => _. This syntax tells Rust: "I am taking this value in, I am modifying it, and the result is garbage/clobbered (_). Do not rely on the value of this register after this block executes." This forces the compiler to reload the pointer if it needs it again, rather than using a stale, corrupted register.
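As a smaller illustration of the same pattern, here is a minimal sketch of a post-indexed store with a correctly declared pointer operand. `store_pair` is a hypothetical helper, not part of the `keccak` crate. It uses the generic `reg` class rather than the explicit `x0` register from the patch, and it falls back to plain Rust on other architectures so the snippet builds anywhere.

```rust
/// Writes `a` and `b` into `buf` using a post-indexed store on AArch64.
/// Hypothetical helper for illustration only.
#[cfg(target_arch = "aarch64")]
fn store_pair(buf: &mut [u64; 2], a: u64, b: u64) {
    unsafe {
        core::arch::asm!(
            "str {a}, [{p}], #8", // store `a`, then advance the pointer register by 8
            "str {b}, [{p}]",     // store `b` at the advanced address
            // The pointer register is modified, so it is `inout` with a discarded output.
            p = inout(reg) buf.as_mut_ptr() => _,
            a = in(reg) a, // genuinely read-only: never written by the template
            b = in(reg) b,
            options(nostack, preserves_flags),
        );
    }
}

/// Portable fallback with the same observable effect.
#[cfg(not(target_arch = "aarch64"))]
fn store_pair(buf: &mut [u64; 2], a: u64, b: u64) {
    buf[0] = a;
    buf[1] = b;
}
```

Declaring `p` as `in(reg)` would compile just as happily, which is what makes this class of bug easy to miss: the compiler cannot inspect the assembly template to check the promise.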

The Exploit: The Optimizer's Revenge

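So how do we weaponize this? With current versions of the Rust compiler and LLVM, you might get lucky: the generated code might not reuse x0 immediately after the block, or it might reload it anyway because of register pressure. That is why this is classified as 'unsoundness' rather than a critical RCE.

However, to an attacker or a researcher, this is a time bomb.

Imagine a build compiled at opt-level 3. The optimizer knows that x0 holds the address of state. After the assembly block, the code might do something like state[0] = 0.

The Setup: The compiler sees x0 is in. It assumes x0 is preserved.
The Optimize: Instead of reloading the address of state for the subsequent write, the compiler emits instructions to write to [x0], assuming x0 still points to the start of the buffer.
The Reality: The assembly block incremented x0 by 32 bytes.
The Corruption: The write state[0] = 0 actually writes to state[4] (assuming 64-bit words).

In a cryptographic context, this is catastrophic. We aren't just crashing; we are corrupting the internal state of a hash function. If an attacker can control the input to influence the loop count or the flow, they might be able to desynchronize the state enough to weaken the hash, leak key material (if used in a MAC), or cause a buffer overflow if x0 is incremented past the bounds of the valid memory region.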
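The index arithmetic in that scenario can be checked with a small simulation. This is a sketch under stated assumptions, not a reproduction of real compiler output. The names `stale_store_index` and `simulate_miscompile` are hypothetical, and the 25-word state size is that of Keccak-f[1600] (1600 bits in 64-bit lanes).

```rust
const STATE_WORDS: usize = 25; // Keccak-f[1600] state: 25 lanes of 64 bits

/// Word index that a later `state[0] = ...` store would hit if the compiler
/// reused a register whose real value had advanced by `advance_bytes`.
fn stale_store_index(advance_bytes: usize) -> usize {
    advance_bytes / std::mem::size_of::<u64>()
}

/// Simulates the miscompiled store: the compiler "believes" the base is word 0,
/// while the register actually points `advance_bytes` further on.
/// Returns `None` when the store would land past the end of the buffer.
fn simulate_miscompile(
    state: &mut [u64; STATE_WORDS],
    advance_bytes: usize,
    value: u64,
) -> Option<usize> {
    let idx = stale_store_index(advance_bytes);
    let slot = state.get_mut(idx)?; // out of bounds models the overflow case
    *slot = value;
    Some(idx)
}
```

With an advance of 32 bytes the store lands in word 4 and word 0 is left untouched. With an advance of 200 bytes the index is 25, one past the end of the state, which is the out-of-bounds case the simulation reports as `None`.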

The Fix: Coming Clean

The remediation is straightforward: stop lying to the compiler. The patch applied in RustCrypto/sponges PR #101 correctly identifies the registers as inout.

If you are a user of keccak (or crates that depend on it like sha3), you need to check if you have the asm feature enabled. It is off by default, which saves the vast majority of users. If you do use it for performance on ARMv8:

Upgrade: Ensure keccak is at version 0.1.6 or higher.
Audit: Run cargo tree | grep keccak to see which version you are pulling in.
Fallback: If you can't upgrade, disable the asm feature. The pure Rust implementation is slower but semantically correct and safe.

This incident serves as a reminder: unsafe in Rust transfers the responsibility of correctness from the compiler to the human. And humans are terrible at tracking invisible hardware side-effects like post-increment registers.

Official Patches

RustCrypto: GitHub Pull Request #101 fixing the register constraints

Technical Appendix

CVSS Score: Unknown

EPSS Probability: 0.04%

Affected Systems

Rust applications using the `keccak` crate with the `asm` feature enabled
ARMv8 (AArch64) architectures

Affected Versions Detail

Product	Affected Versions	Fixed Version
keccak (RustCrypto)	< 0.1.6	0.1.6

Attribute	Detail
Vulnerability Type	Undefined Behavior / Unsoundness
Language	Rust / AArch64 Assembly
Root Cause	Incorrect Inline Assembly Register Constraints
Affected Component	keccak crate (armv8.rs)
Impact	Potential Memory Corruption / Logic Errors
Exploit Status	Theoretical / Compiler-Dependent

MITRE ATT&CK Mapping

T1211: Exploitation for Defense Evasion (Defense Evasion)

T1190: Exploit Public-Facing Application (Initial Access)

CWE-658

Residue in Register (Inferred)

The software relies on the correctness of register constraints in inline assembly. Incorrect constraints lead to a mismatch between compiler assumptions and actual hardware state, resulting in undefined behavior.

Known Exploits & Detection

N/A: No public exploit exists; the issue is theoretical unsoundness.

Vulnerability Timeline

2026-02-12: Vulnerability reported to maintainers

2026-02-13: Fix committed to RustCrypto/sponges

2026-02-17: Advisory published and fixed version 0.1.6 released