Feb 20, 2026
The `ml-dsa` crate applied the division operator (`/`), which compiles to a hardware division instruction, to secret data. Because CPU division latency varies with its inputs, an attacker can time the signing process to reconstruct the private key. Fixed in 0.1.0-rc.2 via Barrett reduction.
A classic timing side-channel vulnerability was discovered in the RustCrypto implementation of ML-DSA (Module-Lattice-Based Digital Signature Algorithm), specifically within the `ml-dsa` crate. Despite being a 'Post-Quantum' algorithm designed to withstand future threats, the implementation fell victim to a very old-school problem: non-constant time integer division. By measuring the precise execution time of the signing operation, an attacker on an adjacent network can statistically recover the private key, rendering the quantum-resistance moot.
We love Rust. It saves us from buffer overflows, use-after-frees, and the general existential dread of C++. But here is the uncomfortable truth: rustc is a compiler, not a magic wand. It cannot save you from algorithmic incompetence, and it certainly cannot save you from the physics of silicon.
CVE-2026-22705 is a beautiful, if tragic, example of this. The target is the ml-dsa crate, part of the RustCrypto project. This crate implements the Module-Lattice-Based Digital Signature Algorithm (ML-DSA), effectively the NIST standard for Post-Quantum Cryptography (PQC). The entire point of this library is to secure data against the supercomputers of the future.
However, while the math might be robust enough to stop a quantum computer, the implementation wasn't robust enough to stop a guy with a stopwatch. By utilizing standard integer division on secret data, the library leaked internal state through time itself. It is the cryptographic equivalent of buying a bank vault made of titanium but forgetting to oil the hinges, allowing a thief to hear exactly when the tumblers align.
To understand this bug, we have to look at how computers do math. Addition, subtraction, and bitwise operations are generally "constant time"—they take the same number of CPU cycles regardless of whether you are adding 1 + 1 or 4,000,000 + 4,000,000. This is crucial for cryptography because if an operation takes longer based on the value of the data, and that data is secret, you are leaking the secret.
Division, however, is the problem child of the ALU (Arithmetic Logic Unit). On many modern cores (x86_64, ARM64), the hardware division instruction (DIV or IDIV on x86_64, UDIV or SDIV on ARM64) is iterative. It effectively performs a series of subtractions and shifts, and it can finish early when the operands allow it. The time it takes to complete therefore depends heavily on the magnitude of the operands. Dividing a large number by a small number takes a different amount of time than dividing two large numbers.
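To make the "subtractions and shifts" picture concrete, here is a toy software model of an early-terminating divider. It is a sketch for intuition only: `model_div` is a made-up name, and real hardware dividers are more sophisticated. What it shows is that the amount of work tracks the size of the dividend.

```rust
/// Toy model of an early-terminating shift-and-subtract divider.
/// Returns (quotient, iterations performed).
/// Illustration only: this is not how any specific CPU implements division.
fn model_div(dividend: u32, divisor: u32) -> (u32, u32) {
    assert!(divisor != 0, "division by zero");
    // Skip the dividend's leading zero bits: fewer significant bits, fewer steps.
    let significant_bits = 32 - dividend.leading_zeros();
    let mut quotient = 0u32;
    let mut remainder = 0u64; // widened so the shift below cannot drop a bit
    let mut steps = 0u32;
    for i in (0..significant_bits).rev() {
        // Bring down the next bit of the dividend.
        remainder = (remainder << 1) | (((dividend >> i) & 1) as u64);
        quotient <<= 1;
        if remainder >= divisor as u64 {
            remainder -= divisor as u64;
            quotient |= 1;
        }
        steps += 1;
    }
    (quotient, steps)
}

fn main() {
    for dividend in [5u32, 190_464, 8_380_416, u32::MAX] {
        let (q, steps) = model_div(dividend, 3);
        println!("{dividend} / 3 = {q} in {steps} steps");
    }
}
```

In this model the step count equals the bit length of the dividend, so anyone who can observe "how long it took" learns something about the value being divided.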
The vulnerability in ml-dsa resides in the decompose function (Algorithm 36 in the spec). The code was performing a reduction using the / operator on a value derived from the secret signing key (s1).
> [!WARNING]
> The Cardinal Rule of Crypto Engineering: Never, ever use variable-time instructions (division, modulo, conditional branches) on secret data.
Because the dividend depended on the secret key, the execution time of the signing operation fluctuated based on the key's bits. This allows for a Timing Side-Channel attack. It doesn't matter how complex the lattice math is; if the CPU yells "I'm busy!" for 50 cycles on a 0 bit and 70 cycles on a 1 bit, the game is over.
Let's look at the smoking gun. In the vulnerable versions of the ml-dsa crate, specifically inside the `decompose` logic, the developers trusted the plain `/` operator too much.
Here is the vulnerable logic. It looks innocent enough to a standard developer, but to a cryptographer, it screams danger:
```rust
// The Vulnerable Code
// r1 is derived from secret state
// TwoGamma2::U32 is a constant
// The compiler emits a variable-latency DIV instruction here
r1.0 /= TwoGamma2::U32;
```

The fix involves removing the hardware division entirely and replacing it with Barrett reduction. Barrett reduction is a method of calculating `x / n` and `x % n` using only multiplication, addition, and bit shifts, operations that are constant-time on modern CPUs. The patch introduces a trait `ConstantTimeDiv` and precomputes the necessary multipliers.
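As a rough illustration of the multiply-and-shift idea, here is a minimal sketch of Barrett-style division by a compile-time constant. `BarrettDiv` and its `ct_div` method are hypothetical names written for this post and are not the crate's `ConstantTimeDiv` trait. The divisor 190,464 in `main` is assumed to be 2·γ₂ for ML-DSA-44, and the correction step is one of several possible ways to make the estimate exact.

```rust
/// Hypothetical helper (not the crate's `ConstantTimeDiv`):
/// branch-free division of a u32 by a fixed nonzero divisor `D`.
struct BarrettDiv<const D: u32>;

impl<const D: u32> BarrettDiv<D> {
    /// floor(2^32 / D), evaluated at compile time.
    const MULTIPLIER: u64 = (1u64 << 32) / (D as u64);

    fn ct_div(x: u32) -> u32 {
        // Estimate the quotient: q_hat is either the true quotient or one less.
        let q_hat = ((x as u64 * Self::MULTIPLIER) >> 32) as u32;
        // The remainder under that estimate lies in [0, 2 * D).
        let r = x - q_hat * D;
        // borrow = 1 if r < D, else 0, read from the top bit of (r - D) in 64 bits.
        let borrow = ((r as u64).wrapping_sub(D as u64) >> 63) as u32;
        // Add 1 to the estimate exactly when r >= D, without branching.
        q_hat + (1 - borrow)
    }
}

fn main() {
    // Assumed example divisor: 2 * gamma2 for ML-DSA-44.
    type Div = BarrettDiv<190_464>;
    for x in [0u32, 190_463, 190_464, 8_380_416, u32::MAX] {
        assert_eq!(Div::ct_div(x), x / 190_464);
    }
    println!("all quotients match");
}
```

Source-level code like this contains no `/` on the secret input, but the optimizer has the final say, so the generated assembly is still worth checking.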
```rust
// The Fixed Code (Commit 035d9ee)
// Instead of /, we use a helper trait that implements Barrett Reduction
let r1 = Elem::new(TwoGamma2::ct_div(diff.0));
// Behind the scenes, ct_div does something like:
// (value * PRECOMPUTED_MULTIPLIER) >> SHIFT_AMOUNT
```

Additionally, the patch fixed the `ntt` and `ntt_inverse` functions. Originally, they used `step_by` in loops, which the compiler optimized into variable-time logic. The fix replaced these with const generic loops (`ntt_layer<const LEN, ...>`), forcing the compiler to unroll or structure the loops in a predictable, constant-time manner.
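The const generic loop pattern can be sketched on a toy transform. The code below is an assumption-laden illustration, not the crate's `ntt_layer`: it uses a twiddle-free butterfly (a Walsh-Hadamard transform over wrapping `u32` arithmetic) on eight elements, and `layer` and `transform` are hypothetical names. The point is only that each layer's stride is a compile-time constant, so every trip count is fixed before the code ever sees data.

```rust
const N: usize = 8;

/// Hypothetical layer function: LEN is the butterfly half-width, known at compile time.
fn layer<const LEN: usize>(a: &mut [u32; N]) {
    let mut start = 0;
    while start < N {
        for j in start..start + LEN {
            let (x, y) = (a[j], a[j + LEN]);
            // Twiddle-free butterfly; a real NTT would multiply by a root of unity here.
            a[j] = x.wrapping_add(y);
            a[j + LEN] = x.wrapping_sub(y);
        }
        start += 2 * LEN;
    }
}

/// Each layer is a separate monomorphized function with fixed loop bounds.
fn transform(a: &mut [u32; N]) {
    layer::<4>(a);
    layer::<2>(a);
    layer::<1>(a);
}

fn main() {
    let mut a = [1, 0, 0, 0, 0, 0, 0, 0];
    transform(&mut a);
    println!("{a:?}");
}
```

Fixed bounds make the loop structure easier for the compiler to unroll and easier for a reviewer to audit, but they are not a guarantee by themselves; the emitted code still has to be inspected.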
So, how do we actually weaponize this? We don't need to break the lattice math. We just need a high-resolution clock and access to the signing oracle.
- A target that signs with `ml-dsa` (e.g., a TLS handshake or a JWT signing service).
- Adjacent access (CVSS AV:A).
- `rdtsc` (Read Time-Stamp Counter) for nanosecond-scale precision.

We employ Differential Timing Analysis. We formulate a hypothesis: "If the first bit of the secret key is 1, the division takes X cycles on average. If it is 0, it takes Y cycles."
We partition our millions of timing samples into two buckets based on the input data we controlled. If the difference in average time between the buckets correlates with our model, we have guessed the bit correctly. We rinse and repeat for the subsequent bits until we have recovered the private key component s1.
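The bucket comparison is ordinary statistics, and it is the same test defenders run against their own code to check for leaks. Here is a minimal sketch using Welch's t-statistic, which is a swapped-in choice on my part (the text above only says "difference in average time"). The function names and the synthetic cycle counts are made up for illustration.

```rust
/// Split (predicted_bit, cycles) samples into two buckets by the predicted bit.
fn partition(samples: &[(bool, f64)]) -> (Vec<f64>, Vec<f64>) {
    let zeros = samples.iter().filter(|s| !s.0).map(|s| s.1).collect();
    let ones = samples.iter().filter(|s| s.0).map(|s| s.1).collect();
    (zeros, ones)
}

/// Mean and sample variance of a bucket (needs at least two samples).
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t-statistic: how many standard errors apart the two bucket means are.
fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let (mean_a, var_a) = mean_var(a);
    let (mean_b, var_b) = mean_var(b);
    (mean_a - mean_b) / (var_a / a.len() as f64 + var_b / b.len() as f64).sqrt()
}

fn main() {
    // Synthetic cycle counts, invented for this example.
    let samples = [
        (false, 50.0), (true, 70.0), (false, 51.0), (true, 71.0),
        (false, 49.0), (true, 69.0), (false, 50.0), (true, 70.0),
        (false, 52.0), (true, 72.0), (false, 48.0), (true, 68.0),
    ];
    let (zeros, ones) = partition(&samples);
    println!("t = {:.2}", welch_t(&zeros, &ones));
}
```

A statistic far from zero means the two buckets really do run at different speeds, which is evidence that the hypothesis about that bit was right. A statistic near zero means either the guess was wrong or more samples are needed.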
Why is this a big deal? Isn't this just a local attack?
Yes and no. The CVSS score is 6.4 (Medium) because of the complexity and proximity requirements. You can't exploit this over the open internet easily because the jitter of the internet is larger than the timing difference of a division instruction.
However, in modern infrastructure, "Adjacent" is broader than you think. Multi-tenant cloud environments, Kubernetes clusters, and trusted execution environments (TEEs) often place attackers on the same physical hardware as the victim. If a malicious tenant can time the execution of a cryptographic service running on a neighboring core, they can steal the Post-Quantum Private Key.
Once the key is stolen, the attacker can sign arbitrary data. They can forge identities, issue fake certificates, or authorize malicious transactions. The irony is palpable: an organization adopts ML-DSA to be safe for the next 50 years, only to be compromised today by a side-channel known since the 1990s.
The remediation is straightforward: Update immediately.
If you are using the ml-dsa crate, ensure you are on version 0.1.0-rc.2 or higher. The RustCrypto team was quick to respond, merging the fix in commit 035d9eef.
If you are writing cryptographic code in Rust (or any language):
- `grep` your codebase for `/` and `%`. If the operands are secret, you have a potential vulnerability.
- Use `subtle`: The Rust `subtle` crate provides traits for constant-time comparison and conditional selection. Use them.
- Inspect the generated assembly (e.g., with `cargo-asm`) for hot paths to ensure no branch instructions or variable-latency opcodes were inserted by the optimizer.

`CVSS:3.1/AV:A/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:N`

| Product | Affected Versions | Fixed Version |
|---|---|---|
| ml-dsa (RustCrypto) | < 0.1.0-rc.2 | 0.1.0-rc.2 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-1240 (Use of a Cryptographic Primitive with a Risky Implementation) |
| CVSS v3.1 | 6.4 (Medium) |
| Attack Vector | Adjacent Network |
| Attack Complexity | High (Requires statistical analysis) |
| Privileges Required | Low |
| Exploit Status | PoC / Academic |
The software uses a cryptographic primitive that has a risky implementation, such as a timing side-channel, which renders the encryption breakable.