CVE-2026-24116: The Greedy Fetch - Crashing Wasmtime with AVX Optimizations
Jan 27, 2026 · 7 min read
Executive Summary (TL;DR)
The Cranelift compiler got too aggressive with AVX optimizations on x86-64. It implemented scalar floating-point `copysign` operations using 128-bit vector instructions without realizing those instructions always fetch 16 bytes from memory. If a guest places a float at the edge of a memory page, the host CPU over-reads into the guard page, crashing the runtime.
A critical code generation flaw in Wasmtime's Cranelift compiler allows malicious WebAssembly modules to trigger host-level segmentation faults. By exploiting AVX instruction folding optimizations, an attacker can force the CPU to read out-of-bounds memory, leading to immediate Denial of Service.
The Hook: When Optimizations Bite Back
We love compilers. They take our garbage code and turn it into highly tuned machine instructions. But sometimes, they get a little too clever for their own good. In the world of WebAssembly (Wasm), the promise is simple: speed and safety. You run untrusted code at near-native speeds because the runtime (like Wasmtime) guarantees sandboxing. But what happens when the compiler itself breaks the rules of physics—or in this case, the rules of memory mapping?
Enter CVE-2026-24116. This isn't a logic error in your application; it's a fundamental flaw in how Wasmtime's backend compiler, Cranelift, speaks x86-64. Specifically, it involves the handling of floating-point numbers when Advanced Vector Extensions (AVX) are enabled.
Here is the setup: You have a secure host running Wasmtime. You allow users to upload Wasm modules (think serverless functions, plugins, or edge computing). You assume the worst thing they can do is an infinite loop. But thanks to this bug, a malicious user can craft a specific sequence of floating-point math that forces the host process to commit suicide via segmentation fault (SIGSEGV). It’s a classic case of the "abstraction leak"—quite literally leaking memory reads beyond the sandbox boundaries.
The Flaw: The 128-bit Hammer Problem
To understand this bug, you need to understand how x86-64 handles floating-point math. WebAssembly has an instruction called f64.copysign (and f32.copysign). It takes the magnitude of one number and applies the sign bit of another. It’s a simple bitwise operation.
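To make "a simple bitwise operation" concrete, here is a minimal Python sketch of `f64.copysign` on raw IEEE 754 bit patterns. The helper names (`f64_to_bits`, `bits_to_f64`, `copysign_bits`) are made up for illustration and are not part of Wasmtime or Cranelift.

```python
import struct

SIGN_MASK = 0x8000_0000_0000_0000       # bit 63: the sign bit of an IEEE 754 double
MAGNITUDE_MASK = 0x7FFF_FFFF_FFFF_FFFF  # bits 0-62: exponent and mantissa


def f64_to_bits(x: float) -> int:
    """Reinterpret a float as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_f64(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a float."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def copysign_bits(magnitude: float, sign: float) -> float:
    """Keep everything but the sign from `magnitude`, take the sign from `sign`."""
    magnitude_bits = f64_to_bits(magnitude) & MAGNITUDE_MASK
    sign_bit = f64_to_bits(sign) & SIGN_MASK
    return bits_to_f64(magnitude_bits | sign_bit)


print(copysign_bits(1.5, -2.0))   # -1.5
print(copysign_bits(-3.0, 4.0))   # 3.0
```

No arithmetic is involved: two masks and an OR are enough, which is why the operation maps so naturally onto bitwise instructions.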
However, x86-64 doesn't have a native scalar instruction to do bitwise operations on floating-point registers. To implement copysign, the compiler has to get creative using SIMD (Single Instruction, Multiple Data) instructions. It uses a sequence of logical ANDs and ORs (VANDPD, VANDNPD, VORPD) to stitch the sign bit onto the magnitude.
Here is the catch: these AVX instructions operate on 128-bit (16-byte) vectors. But a double-precision float (f64) is only 64 bits (8 bytes), and a single-precision float (f32) is just 32 bits (4 bytes).
Cranelift has an optimization pass called "load sinking." If an instruction needs a value from memory, the compiler tries to fold that load directly into the instruction to save a cycle. Instead of Load X -> Math X, it generates Math [Memory].
The fatal flaw was that Cranelift allowed this optimization for copysign. It generated code like VANDPD xmm0, xmm1, [address]. The CPU sees a VANDPD with a memory operand and says, "I am an AVX instruction; I shall fetch 128 bits (16 bytes)." It does not care that the Wasm logic only cared about the first 64 bits. It grabs the whole chunk. This is the "Greedy Fetch."
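The width mismatch is easy to see in a toy model. The sketch below treats memory as a plain Python `bytes` object and compares what an 8-byte scalar read touches with what a 16-byte vector read touches at the same address. The `read` helper and the sample values are illustrative assumptions, not Cranelift code.

```python
import struct

# Two adjacent doubles: the operand the guest asked for, then a neighbor.
memory = struct.pack("<2d", 1.0, 42.0)
operand_addr = 0


def read(mem: bytes, addr: int, width: int) -> bytes:
    """Return the `width` bytes an access at `addr` would touch."""
    return mem[addr:addr + width]


scalar = read(memory, operand_addr, 8)    # what an f64 load needs
vector = read(memory, operand_addr, 16)   # what a 128-bit memory operand fetches

print(struct.unpack("<d", scalar))    # (1.0,)
print(struct.unpack("<2d", vector))   # (1.0, 42.0)
```

The second value is data the Wasm program never asked for. Here it is just a neighboring double; the trouble starts when those extra 8 bytes are not mapped at all.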
The Code: Instruction Selection Roulette
Let's look at the Instruction Selection Lowering Engine (ISLE) rules. This is the domain-specific language Cranelift uses to map intermediate representation (IR) to machine code.
The Vulnerable Logic:
Prior to the fix, the compiler was allowed to match an fcopysign operation where the operands were memory loads (ValueType::Mem). The generated assembly looked something like this for an f64 operation:
; Target: x86-64 with AVX
; logical 'and-not' packed double
; xmm1 contains the sign bit mask
; [rax] points to our 8-byte float
vandnpd xmm0, xmm1, [rax] ; <--- DANGER ZONE
Even though [rax] points to an 8-byte value, vandnpd reads 16 bytes. It reads the 8 bytes we want, plus the following 8 bytes that we didn't ask for.
The Fix (Commit 799585f): The patch forces the operands into registers before performing the bitwise operations. This explicitly disables load-sinking for these specific patterns.
;; New rule structure in Cranelift
(rule (lower (has_type $F64 (fcopysign a @ (value_type $F64) b)))
(let ((sign_bit Xmm (imm $F64 0x8000000000000000))
(a Xmm a) ;; Force 'a' into a register
(b Xmm b)) ;; Force 'b' into a register
(x64_orpd ... )))
By forcing the value into a register first, the compiler is compelled to emit a move instruction appropriate for the data size (e.g., vmovsd for doubles), which correctly reads only 8 bytes. The subsequent bitwise math then happens safely inside the 128-bit registers, and whatever ends up in the upper 64 bits is ignored because only the low lane holds the scalar result.
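The claim that the upper bits do not matter can be checked with a small model. This sketch represents a 128-bit register as a Python integer and runs the mask-and-combine sequence on the whole register; `copysign_128` and the sample bit patterns are assumptions made for illustration, not the code Cranelift emits.

```python
LOW_LANE = (1 << 64) - 1            # bits 0-63 of a 128-bit register
FULL_REG = (1 << 128) - 1           # all 128 bits
SIGN_MASK = 0x8000_0000_0000_0000   # sign bit of the f64 in the low lane


def copysign_128(a: int, b: int) -> int:
    """Apply the AND-NOT / AND / OR sequence to full 128-bit register values."""
    magnitude = a & ~SIGN_MASK      # clear the sign bit of `a`
    sign = b & SIGN_MASK            # keep only the sign bit of `b`
    return (magnitude | sign) & FULL_REG


a_low = 0x3FF8_0000_0000_0000       # bit pattern of 1.5
b_low = 0xC000_0000_0000_0000       # bit pattern of -2.0

clean = copysign_128(a_low, b_low)
dirty = copysign_128(a_low | (0xDEAD_BEEF << 64), b_low | (0xFFFF << 64))

print(hex(clean & LOW_LANE))                       # 0xbff8000000000000, i.e. -1.5
print((clean & LOW_LANE) == (dirty & LOW_LANE))    # True
```

Bitwise operations never move information between bit positions, so the low lane comes out the same whatever the upper lane holds. The bug was never in the math; it was in where the operand bytes came from.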
The Exploit: Living on the Edge
How do we weaponize this? We need to construct a memory layout where reading "too much" is fatal. In Wasmtime (and most systems), reading garbage data is usually fine—unless you read memory that doesn't exist.
WebAssembly linear memory is typically surrounded by guard pages—unmapped regions of virtual memory designed to catch out-of-bounds accesses. If you touch a guard page, the OS sends a SIGSEGV.
The Attack Chain:
- Allocate Linear Memory: Create a Wasm module that allocates a chunk of memory (pages).
- Position the Payload: We calculate the exact address of the end of our allocated memory. Let's say our memory ends at address `0x10000`.
- The Trap: We place an `f64` value at `0x0FFF8` (exactly 8 bytes before the end).
- The Trigger: We execute `f64.copysign` using that value as an operand.
The Crash:
The compiled code executes VANDNPD xmm0, xmm1, [0x0FFF8].
The CPU attempts to read 16 bytes starting at 0x0FFF8.
- Bytes 0-7 are valid (inside our memory).
- Bytes 8-15 fall into `0x10000`+, which is the unmapped guard page.
- SIGSEGV. The host process crashes instantly.
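The address arithmetic can be replayed in a toy model. The `LinearMemory` class and `GuardPageFault` exception below are hypothetical stand-ins for the mapped region and the OS signal; only the numbers (`0x10000`, `0x0FFF8`, 8 and 16 bytes) come from the walkthrough above.

```python
class GuardPageFault(Exception):
    """Stand-in for the SIGSEGV raised when an access touches a guard page."""


class LinearMemory:
    """Toy model: `size` mapped bytes followed immediately by a guard page."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def load(self, addr: int, width: int) -> bytes:
        # Every byte of the access must be mapped, not just the first one.
        if addr < 0 or addr + width > len(self.data):
            raise GuardPageFault(
                f"{width}-byte read at {addr:#x} crosses {len(self.data):#x}"
            )
        return bytes(self.data[addr:addr + width])


memory = LinearMemory(0x10000)   # memory ends at 0x10000, as in the text
operand = 0x0FFF8                # the f64 sits in the last 8 mapped bytes

memory.load(operand, 8)          # scalar-sized read: stays in bounds
try:
    memory.load(operand, 16)     # 128-bit read: bytes 8-15 hit the guard page
    crashed = False
except GuardPageFault as fault:
    crashed = True
    print(fault)
```

The same address is perfectly legal for an 8-byte access and fatal for a 16-byte one, which is why a bounds check sized for the `f64` does not protect the folded vector instruction.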
> [!NOTE]
> While Wasmtime has signal handlers to catch traps, a raw SIGSEGV from an instruction that the compiler assumes is "safe" (because it passed bounds checks) can bypass expected handling logic or simply terminate the process depending on the embedding configuration. In many serverless environments, this hard crash brings down the worker node.
The Impact: Why Should We Panic?
If you are running a single-tenant CLI tool, this is annoying. If you are a cloud provider or running a multi-tenant SaaS platform executing user-submitted plugins (like Shopify Functions, Cloudflare Workers, etc.), this is a Critical Availability Issue.
Denial of Service (DoS): A single malicious tenant can repeatedly crash the worker nodes hosting the runtime. This requires zero privileges and no authentication bypass—just valid (but malicious) Wasm code. If your orchestration layer restarts the pod, the attacker just runs it again, creating a boot-loop.
Theoretical Info Leak:
There is a subtle secondary risk. When the CPU reads those extra 8 bytes into the XMM register, that data exists in the register file. While Wasm copysign results are truncated to their scalar size when stored back to memory, a sophisticated attacker might find a gadget or a secondary compiler bug that allows them to observe the high bits of the XMM register. If the adjacent memory wasn't a guard page but rather another tenant's data, this could turn into a buffer over-read vulnerability. However, the immediate and guaranteed impact is the crash.
The Fix: Restraining Order
The remediation is straightforward: Update Wasmtime. The Bytecode Alliance responded quickly, releasing patches that modify the ISLE codegen rules to prevent this unsafe optimization.
Patched Versions:
- 41.0.1
- 40.0.3
- 36.0.5
If you cannot update immediately, you have two options, neither of which is great:
- Disable AVX: Recompile or configure Wasmtime to target a baseline x86-64 architecture without AVX. This will revert to legacy SSE instructions which do not exhibit this specific behavior (or use different lowering rules), but your performance on math-heavy workloads will tank.
- Pray: Hope your users are nice. (Not recommended).
For developers maintaining their own Cranelift integrations, ensure you pull the latest cranelift-codegen crate. The fix is entirely within the compiler backend logic, so no changes to guest Wasm modules are required.
Technical Appendix
CVSS:3.1/AV:L/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| Wasmtime (Bytecode Alliance) | = 41.0.0 | 41.0.1 |
| Wasmtime (Bytecode Alliance) | >= 40.0.0 < 40.0.3 | 40.0.3 |
| Wasmtime (Bytecode Alliance) | <= 39.x.x | 36.0.5 |
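For fleet audits, the ranges in this table can be turned into a quick check. The sketch below is one possible reading of the table, not an official tool: `parse_version` and `is_affected` are hypothetical helpers, and two assumptions are baked in and marked in the comments. Release lines at or below 39.x other than 36.x are treated as affected with no fix, and anything newer than 41.x is treated as fixed.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string into a comparable tuple."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_affected(version: str) -> bool:
    """Apply the affected/fixed ranges from the table above."""
    v = parse_version(version)
    major = v[0]
    if major >= 42:
        return False            # assumption: releases after 41.x ship the fix
    if major == 41:
        return v < (41, 0, 1)
    if major == 40:
        return v < (40, 0, 3)
    if major == 36:
        return v < (36, 0, 5)
    return True                 # assumption: other lines <= 39.x stay affected


for candidate in ("41.0.0", "41.0.1", "40.0.2", "36.0.5", "39.0.0"):
    print(candidate, "affected" if is_affected(candidate) else "patched")
```

Pre-release or build-metadata suffixes are not handled; the official advisory remains the authority on exact ranges.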
| Attribute | Detail |
|---|---|
| CWE | CWE-125 (Out-of-bounds Read) |
| Attack Vector | Local (Guest Wasm Module) |
| CVSS | 7.5 (High) |
| Architecture | x86-64 + AVX |
| Impact | Denial of Service (Host Crash) |
| Component | Cranelift ISLE |
MITRE ATT&CK Mapping
The software reads data past the end, or before the beginning, of the intended buffer.