Jan 27, 2026
The `weights_only=True` flag in PyTorch was supposed to be the silver bullet against Pickle RCE. However, a logic flaw in the underlying C++ unpickler allows attackers to use the `SETITEM` opcode on non-container objects. This causes Type Confusion on the heap, allowing a malicious model file to corrupt memory and execute arbitrary code, even when the user explicitly requests 'safe' loading.
A critical heap corruption vulnerability in PyTorch's restricted unpickler allows attackers to bypass the `weights_only=True` security flag, turning safe model loading into arbitrary code execution.
For years, the Python security community has screamed one mantra until our throats were sore: Do not unpickle untrusted data. Pickle is not a serialization format; it is a stack-based virtual machine that executes instructions. Giving someone a pickle file is like handing them a loaded gun and hoping they don't pull the trigger.
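To make the "virtual machine" claim concrete, here is a small sketch that uses only the standard library to list the opcodes in a harmless pickle. Nothing here is specific to PyTorch; it just shows that a pickled dictionary is a short program, and that `SETITEM` is one of its instructions.

```python
import pickle
import pickletools

# Serialize a one-item dict with protocol 2 so the opcode stream stays short.
data = pickle.dumps({"a": 1}, protocol=2)

# genops yields (opcode, argument, byte position) without executing anything.
opcode_names = [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]
print(opcode_names)

# Human-readable disassembly of the same stream.
pickletools.dis(data)
```

The disassembly shows the dictionary being built step by step: an empty dict is pushed, then a key and a value, and `SETITEM` stores the pair into the dict before `STOP` ends the program.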
PyTorch, realizing that the Machine Learning ecosystem is built entirely on people downloading random .pt files from Hugging Face, introduced weights_only=True. This flag was supposed to be the bouncer at the club. It uses a restricted unpickler that only allows a whitelist of classes (like torch.Tensor and basic primitives) and blocks the dangerous GLOBAL opcodes used to import os.system.
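The allowlist idea can be sketched with the standard library's `pickle.Unpickler.find_class` hook. This is not PyTorch's implementation: the `ALLOWED_GLOBALS` set and the `restricted_loads` helper are hypothetical names chosen for illustration, and the real allowlist is much larger.

```python
import io
import os
import pickle

# Hypothetical allowlist for illustration; PyTorch's real list is different.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside the allowlist."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(data: bytes):
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Plain containers and primitives never reach find_class, so they load fine.
safe = restricted_loads(pickle.dumps({"weights": [0.1, 0.2]}))

# A pickle that references os.system is rejected before anything is called.
try:
    restricted_loads(pickle.dumps(os.system))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(safe, blocked)
```

Note what this kind of filter does and does not cover: it controls which globals can be imported, but it says nothing about whether the remaining opcodes are applied to objects of the right type.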
It was a beautiful dream. We thought we were safe. But CVE-2026-24747 is the wake-up call that reminds us: if you build a security boundary in C++, you better make sure you validate your pointers. This vulnerability proves that even a 'neutered' pickle machine can still be weaponized if the interpreter gets confused about what it's holding.
To understand this bug, you have to look at torch/csrc/jit/serialization/unpickler.cpp. The unpickler is a state machine that reads opcodes from a stream and manipulates a stack of IValue objects. When the unpickler encounters a SETITEM opcode, it assumes the developer is behaving normally—assigning a value to a dictionary.
Here is the logic flaw: The unpickler checked what was being put into the container, but it didn't rigorously check if the container was actually a container. It blindly trusted that if the pickle stream invoked SETITEM, the object currently sitting on top of the stack was a Dict or List.
> [!NOTE]
> The Bug Class: This is a classic Type Confusion vulnerability (CWE-843). The software reads a piece of memory acting as Object A (e.g., a Tensor) but treats it as Object B (e.g., a Dictionary).
When the C++ code executes the set operation on a non-container object (like a SWALR scheduler instance or a raw Tensor), it calculates a memory offset where it thinks the dictionary buckets are. Since the object isn't a dictionary, that write lands somewhere else entirely, potentially overwriting a vtable pointer, a function pointer, or unrelated object metadata. It's the memory equivalent of trying to file a document in a cabinet, but the cabinet is actually a wood chipper.
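Type confusion is easier to see in miniature. The sketch below uses `ctypes` to view one struct through the layout of another. Both layouts (`FakeTensor`, `FakeDict`) are invented for illustration and have nothing to do with PyTorch's real object layouts; the point is only that a write through the wrong view lands on a field the writer never meant to touch.

```python
import ctypes


# Hypothetical layouts for illustration only; these are not PyTorch's structs.
class FakeTensor(ctypes.Structure):
    _fields_ = [("refcount", ctypes.c_uint64), ("data_ptr", ctypes.c_uint64)]


class FakeDict(ctypes.Structure):
    _fields_ = [("refcount", ctypes.c_uint64), ("first_bucket", ctypes.c_uint64)]


tensor = FakeTensor(refcount=1, data_ptr=0x1000)

# Reinterpret the tensor's memory as a dict without checking its real type.
confused = ctypes.cast(ctypes.pointer(tensor), ctypes.POINTER(FakeDict)).contents

# A "dictionary insert" through the wrong view lands on the tensor's data pointer.
confused.first_bucket = 0x4141414141414141

print(hex(tensor.data_ptr))
```

The two structs are deliberately the same size, so the write stays inside the allocation. In a real type confusion the layouts differ, which is how the write can escape the object and reach neighboring heap memory.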
The vulnerability was exposed by an innocent bystander: the SWALR (Stochastic Weight Averaging Learning Rate) scheduler. The developers of this component made a classic mistake—they tried to serialize a python function (self.anneal_func) inside the checkpoint.
When weights_only=True attempted to load this, it usually just threw a fit. But specifically, the internal state representation of SWALR combined with the restricted unpickler created a scenario where valid objects were on the stack, but the SETITEM logic was invoked in a context the C++ engine wasn't prepared for.
Here is the patch that fixed the trigger in torch/optim/swa_utils.py (Commit 954dc5183ee9205cbe79876ad05dd2d9ae752139):
```python
# BEFORE: Blindly pickling everything
# def state_dict(self):
#     return {key: value for key, value in self.__dict__.items()}

# AFTER: Explicitly sanitizing the dangerous 'anneal_func'
def state_dict(self):
    state = self.__dict__.copy()
    # Remove the callable function from the state
    state.pop('anneal_func', None)
    # Store the strategy as a string instead
    state['_anneal_strategy'] = self._anneal_strategy
    return state
```

While this Python patch stops SWALR from accidentally triggering the bug, the real fix had to happen in the C++ unpickler to validate opcodes properly. If they only fixed the Python code, attackers could still manually craft a pickle stream to replicate the SWALR state configuration and trigger the crash.
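The pattern behind the Python-side patch, storing a strategy name and rebuilding the callable on load, can be sketched with a toy class. `ToyScheduler` and the `ANNEAL_FUNCS` table are hypothetical stand-ins, not the actual SWALR code.

```python
import math

# Hypothetical strategy table; the names and formulas are illustrative only.
ANNEAL_FUNCS = {
    "linear": lambda t: t,
    "cos": lambda t: (1 - math.cos(math.pi * t)) / 2,
}


class ToyScheduler:
    def __init__(self, anneal_strategy="cos"):
        self._anneal_strategy = anneal_strategy
        self.anneal_func = ANNEAL_FUNCS[anneal_strategy]
        self.step_count = 0

    def state_dict(self):
        # Copy the attributes, then drop the callable so only plain data remains.
        state = self.__dict__.copy()
        state.pop("anneal_func", None)
        return state

    def load_state_dict(self, state):
        # Restore plain data, then rebuild the callable from its string name.
        self.__dict__.update(state)
        self.anneal_func = ANNEAL_FUNCS[self._anneal_strategy]


saved = ToyScheduler("linear").state_dict()
restored = ToyScheduler("cos")
restored.load_state_dict(saved)
print(saved, restored.anneal_func(0.25))
```

Because the saved state holds only strings and numbers, a restricted loader never has to decide whether a pickled function is safe to import.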
To exploit this, we don't need a valid neural network. We need a handcrafted pickle stream. The goal is to confuse the unpickler into writing 8 bytes of our choosing to an arbitrary memory address relative to a heap object.
The Attack Chain:
1. Instantiate a `torch.Tensor` because `weights_only=True` allows it.
2. Push the `Tensor` onto the stack. This is our 'victim' object.
3. Emit the `SETITEM` opcode.

The unpickler looks at the stack. It sees `[Tensor, Payload]`. It executes `SETITEM`. The C++ code interprets the Tensor as a Dict. It attempts to hash the key (which might be missing or defaulted) and write the value.
If the attacker aligns the heap correctly (Heap Feng Shui), this out-of-bounds write can overwrite the Tensor's internal data pointer or its C++ vtable. Once we control the instruction pointer, we bypass the 'No-Code-Execution' promise of the restricted loader entirely.
This vulnerability is particularly nasty because it targets the specific mechanism designed to enable trust. Security teams often allow .pt files through firewalls or into air-gapped training environments under the condition that weights_only=True is enforced.
With CVE-2026-24747, that condition is moot. An attacker can upload a model to a public repository (like Hugging Face or CivitAI) that looks like a valid PyTorch checkpoint. When a data scientist downloads it to fine-tune a model locally, the exploit fires as soon as the file is loaded.
The impact ranges from crashing the training cluster (Denial of Service) to full Remote Code Execution (RCE) on the GPU cluster head node. Considering these nodes often have access to massive datasets and proprietary algorithms, the confidentiality loss is catastrophic.
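For triage, it can help to look at a checkpoint's opcode stream without ever executing it. The sketch below assumes the zip-based checkpoint layout, where the pickle program lives in an archive member ending in `data.pkl`. The `opcode_histogram` helper is a hypothetical name, and a stand-in archive is built in memory so the example runs without PyTorch.

```python
import collections
import io
import pickle
import pickletools
import zipfile


def opcode_histogram(checkpoint_bytes: bytes) -> dict:
    """Count pickle opcodes in every *.pkl member of a zip-format checkpoint."""
    counts = collections.Counter()
    with zipfile.ZipFile(io.BytesIO(checkpoint_bytes)) as archive:
        for member in archive.namelist():
            if member.endswith(".pkl"):
                stream = archive.read(member)
                # genops only parses the stream; it never executes it.
                counts.update(op.name for op, _arg, _pos in pickletools.genops(stream))
    return dict(counts)


# Stand-in archive so the sketch runs without PyTorch installed.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    payload = pickle.dumps({"layer.weight": [1.0, 2.0]}, protocol=2)
    archive.writestr("archive/data.pkl", payload)

histogram = opcode_histogram(buffer.getvalue())
print(histogram)
```

This is an audit aid, not a defense. An opcode count cannot tell whether a given `SETITEM` targets a real container, so a clean-looking histogram does not make an unpatched loader safe.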
The only viable mitigation is to upgrade PyTorch to version 2.10.0 or later. The fix involves adding strict type checking in unpickler.cpp before processing modification opcodes. If SETITEM is called, the engine must verify the target is actually a mutable container.
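The shape of that check can be sketched with a toy stack machine. This is a simplified Python model, not the C++ patch: the opcode subset, the `run` function, and the `UnpicklingTypeError` exception are all hypothetical.

```python
class UnpicklingTypeError(Exception):
    """Raised when an opcode targets an object of the wrong type."""


def run(ops):
    """Execute a tiny, hypothetical subset of a pickle-like stack machine."""
    stack = []
    for name, arg in ops:
        if name == "PUSH":
            stack.append(arg)
        elif name == "EMPTY_DICT":
            stack.append({})
        elif name == "SETITEM":
            if len(stack) < 3:
                raise UnpicklingTypeError("SETITEM needs a target, a key and a value")
            value = stack.pop()
            key = stack.pop()
            target = stack[-1]
            # Verify the target is a mutable mapping before writing into it.
            if not isinstance(target, dict):
                raise UnpicklingTypeError(
                    f"SETITEM target is {type(target).__name__}, expected dict"
                )
            target[key] = value
        else:
            raise UnpicklingTypeError(f"unsupported opcode {name}")
    return stack[-1] if stack else None


good = run([("EMPTY_DICT", None), ("PUSH", "k"), ("PUSH", 1), ("SETITEM", None)])

try:
    run([("PUSH", 3.14), ("PUSH", "k"), ("PUSH", 1), ("SETITEM", None)])
    rejected = False
except UnpicklingTypeError:
    rejected = True

print(good, rejected)
```

The design point is that the check happens before the write and fails closed: a stream that applies `SETITEM` to anything other than a container aborts the load instead of touching memory.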
Immediate Actions:
- Upgrade: `pip install "torch>=2.10.0"`
- Even with `weights_only=True` enabled, do not assume existing files are safe.

Remember: In the world of serialization, parsing is just coding with someone else's bugs.
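A deployment gate for the version requirement might look like the sketch below. `parse_release` and `is_patched` are hypothetical helpers, and the fixed version is taken from this advisory. The string would normally come from `torch.__version__`, which is an assumption here so the example runs without PyTorch.

```python
import re

FIXED_VERSION = (2, 10, 0)  # first fixed release according to this advisory


def parse_release(version: str) -> tuple:
    """Extract the leading numeric release, ignoring suffixes like '+cu121'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_patched(version: str) -> bool:
    # Compare as integer tuples; plain string comparison would rank "2.9" above "2.10".
    return parse_release(version) >= FIXED_VERSION


for candidate in ("2.9.1+cu121", "2.10.0", "2.11"):
    print(candidate, is_patched(candidate))
```

This ignores pre-release suffixes, so a development build labeled `2.10.0` with a suffix would pass the check. Treat such builds with suspicion unless you know they contain the fix.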
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

| Product | Affected Versions | Fixed Version |
|---|---|---|
| PyTorch (Meta) | < 2.10.0 | 2.10.0 |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-843 (Type Confusion) |
| Attack Vector | Network / File (Context Dependent) |
| CVSS | 8.8 (High) |
| Impact | Remote Code Execution (RCE) |
| Trigger | Opcode 'SETITEM' on non-container |
| Component | torch/csrc/jit/serialization/unpickler.cpp |
The program accesses a resource using an incompatible type, which triggers a logical error because the resource does not have the expected properties.