CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

CVE-2026-3029: Arbitrary File Write via Path Traversal in PyMuPDF CLI

Alon Barad
Software Engineer

Mar 19, 2026 · 5 min read

Executive Summary (TL;DR)

A path traversal vulnerability in PyMuPDF's CLI allows arbitrary file writes when extracting embedded files from untrusted PDFs.

CVE-2026-3029 is a high-severity path traversal vulnerability in the PyMuPDF library, specifically within the CLI extraction utility. The flaw allows an attacker to craft a malicious PDF that, when processed without an explicit output directory, writes embedded files to arbitrary locations on the host filesystem.

Vulnerability Overview

CVE-2026-3029 is a path traversal and arbitrary file write vulnerability affecting the PyMuPDF Python library. The flaw resides within the command-line interface utility, specifically the pymupdf embed-extract command implemented in the src/__main__.py module.

The root cause involves the CLI tool blindly trusting the filename metadata stored within an embedded file dictionary of a PDF document when an explicit output path is omitted. This metadata is entirely user-controlled and is not sanitized before being passed to underlying filesystem write operations.

An attacker exploits this vulnerability by crafting a malicious PDF document with embedded files containing directory traversal sequences in their filenames. When a victim processes this document using the extraction command, the application resolves the path traversal sequences and writes the embedded payload to arbitrary locations on the host filesystem.

Root Cause Analysis

The vulnerability is classified as CWE-22 (Improper Limitation of a Pathname to a Restricted Directory). The core issue manifests in the embedded_get function, which handles the extraction of embedded file streams from PDF documents.

When the pymupdf embed-extract command is executed without the -output argument, the script falls back to using the embedded file's internal name. The application retrieves this value directly from the PDF dictionary via d["filename"] and uses it as the destination path for the extracted data stream.

Because the PDF format allows arbitrary strings in the filename field of an embedded file stream, an attacker can insert relative path sequences such as ../../. Python's open() function performs no sanitization of its own: it passes the path to the operating system unchanged, and the operating system resolves the traversal sequences, directing the file write operation outside the intended working directory.
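The effect of a traversal sequence on the final write location can be illustrated without touching the filesystem. The following is a minimal sketch, not PyMuPDF code: the helper name resolve_from and the working directory /home/victim/work are hypothetical, and posixpath is used so the result is the same on any platform. It normalizes the path lexically, which approximates what the operating system does when no symbolic links are involved.

```python
import posixpath


def resolve_from(cwd: str, stored_name: str) -> str:
    """Return the normalized path that `stored_name` denotes when opened from `cwd`.

    Lexical normalization only; symbolic links are not taken into account.
    """
    return posixpath.normpath(posixpath.join(cwd, stored_name))


# Hypothetical working directory of the victim.
cwd = "/home/victim/work"

# The stored name from the embedded file dictionary climbs two levels up.
print(resolve_from(cwd, "../../target_file.txt"))  # /home/target_file.txt

# A benign stored name stays inside the working directory.
print(resolve_from(cwd, "notes.txt"))  # /home/victim/work/notes.txt
```

Each ../ component removes one directory level from the working directory, so the number of components an attacker needs depends on where the victim runs the command.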

Code Analysis

Analyzing the vulnerable code in src/__main__.py reveals the exact mechanism of the flaw. Prior to version 1.26.7, the application constructed the output filename using a simple conditional statement without validating the resulting path.

filename = args.output if args.output else d["filename"]
with open(filename, "wb") as output:
    output.write(stream)

The patch introduced in commit 603cafe38a183b8bab34f16d05043b4185d8d40a implements strict path validation logic. When an explicit output path is omitted and the new -unsafe flag is not set, the application now normalizes the target path using os.path.abspath(filename) and enforces directory pinning.

filename = args.output if args.output else d["filename"]
if not args.unsafe and not args.output:
    if os.path.exists(filename):
        sys.exit(f'refusing to overwrite existing file with stored name: {filename}')
    filename_abs = os.path.abspath(filename)
    if not filename_abs.startswith(os.getcwd() + os.sep):
        sys.exit(f'refusing to write stored name outside current directory: {filename}')

This validation ensures the resolved absolute path remains within the current working directory tree. It also implements an existence check to prevent overwriting existing files, closing the arbitrary file write vector.
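The containment test at the heart of the patch can be isolated into a small function for experimentation. This is a sketch that mirrors the startswith comparison shown above, assuming POSIX-style paths; the function name stays_inside and the directory /srv/work are hypothetical and do not appear in PyMuPDF.

```python
import os


def stays_inside(base_dir: str, stored_name: str) -> bool:
    """Return True if `stored_name`, resolved against `base_dir`, remains below it."""
    base = os.path.abspath(base_dir)
    # os.path.join discards `base` when `stored_name` is absolute,
    # so absolute stored names are rejected by the prefix test below.
    target = os.path.abspath(os.path.join(base, stored_name))
    # The trailing separator prevents "/srv/work2" from matching "/srv/work".
    return target.startswith(base + os.sep)


print(stays_inside("/srv/work", "report.txt"))             # True
print(stays_inside("/srv/work", "../../target_file.txt"))  # False
print(stays_inside("/srv/work", "/etc/passwd"))            # False
print(stays_inside("/srv/work", "../work2/report.txt"))    # False
```

One caveat when reusing this pattern: os.path.abspath normalizes the path lexically and does not resolve symbolic links, so code that must also defend against symlinked directories would need os.path.realpath or an equivalent check.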

Exploitation Methodology

Exploiting CVE-2026-3029 requires generating a PDF file with a specifically crafted embedded file object. The attacker modifies the filename or ufilename attributes of the embedded file dictionary to include traversal sequences leading to a target destination path.

The official repository contains a proof-of-concept demonstrating this methodology. An attacker uses the PyMuPDF API to append a new embedded file containing a malicious payload, explicitly setting the filename to a location outside the current directory.

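import pymupdf

# Create a one-page PDF and attach an embedded file whose stored name
# contains directory traversal sequences.
with pymupdf.open() as document:
    document.new_page()
    document.embfile_add(
        "evil_entry",                  # entry name later passed to -name
        b"malicious payload data\n",   # bytes written on extraction
        filename="../../target_file.txt",
        ufilename="../../target_file.txt",
        desc="poc",
    )
    document.save("poc.pdf")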

The exploit chain completes when a victim executes python -m pymupdf embed-extract poc.pdf -name evil_entry on the command line. The application extracts the payload and writes it to the location dictated by the traversal sequence, relative to the victim's current working directory.
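Defenders can screen stored names for the same patterns before extracting anything. The sketch below is one possible heuristic, not part of PyMuPDF: the function name is_suspicious_name is hypothetical, and it flags absolute paths, Windows drive prefixes, and any .. path component.

```python
def is_suspicious_name(stored_name: str) -> bool:
    """Flag stored names that are absolute or contain a '..' path component."""
    # Treat backslashes as separators so Windows-style names are covered too.
    normalized = stored_name.replace("\\", "/")
    if normalized.startswith("/"):
        return True  # absolute POSIX path or UNC-style path
    if normalized[1:2] == ":":
        return True  # Windows drive prefix such as C:
    return ".." in normalized.split("/")


for name in ["invoice.pdf", "../../target_file.txt", "/etc/cron.d/job", "a..b.txt"]:
    print(name, is_suspicious_name(name))
```

With PyMuPDF installed, the stored names could be gathered with Document.embfile_names() and the filename and ufilename keys of Document.embfile_info(); that step is left out here so the sketch runs without the library.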

Impact Assessment

The primary impact of this vulnerability is arbitrary file write on the host filesystem running the PyMuPDF CLI utility. This capability allows an attacker to manipulate system state, modify configuration files, or drop executable payloads.

A successful exploit grants the attacker the same filesystem permissions as the user executing the pymupdf command. Even as an unprivileged user, this is enough to overwrite user-owned files such as ~/.ssh/authorized_keys. If the utility is run with administrative or root privileges, the attacker also gains the ability to overwrite critical system binaries and system-wide configuration files.

The CVSS v3.1 base score is 7.8 (High), reflecting the severe impact on confidentiality, integrity, and availability. The attack vector is local (AV:L) and user interaction is required (UI:R): the attacker relies on a victim processing the malicious document rather than exploiting a network service directly.
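The 7.8 figure can be reproduced from the vector string CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H. The following is a minimal sketch of the CVSS v3.1 base score equations, restricted to the scope-unchanged case; the metric weights and the rounding rule follow the CVSS v3.1 specification, while the function and variable names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    c, i, a = (WEIGHTS["CIA"][metrics[key]] for key in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * math.prod(
        WEIGHTS[key][metrics[key]] for key in ("AV", "AC", "PR", "UI")
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the exploitability term is about 1.83 and the impact term about 5.87; their sum of roughly 7.71 rounds up to 7.8.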

Remediation and Mitigation

The primary remediation strategy is upgrading the PyMuPDF library to version 1.26.7 or later. The patched versions enforce strict directory boundaries and implement safe default behaviors for the extraction CLI.

For environments where immediate upgrading is not feasible, users must avoid running the pymupdf embed-extract command on untrusted PDF documents. If extraction is strictly necessary, users must explicitly define the destination path using the -output argument.

Providing the -output argument bypasses the vulnerable logic path entirely by discarding the embedded filename metadata. Additionally, users operating version 1.26.7 or later must refrain from using the newly introduced -unsafe flag when processing files from unverified sources.
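Scripts that wrap the CLI can enforce the explicit-output rule by always constructing the command themselves. This is a sketch under stated assumptions: the helper build_extract_command and its parameters are hypothetical, the output name is chosen by the caller rather than taken from the PDF, and only the embed-extract, -name, and -output arguments described above are used.

```python
import os
import sys


def build_extract_command(pdf_path: str, entry_name: str, out_dir: str, out_name: str) -> list:
    """Build an embed-extract command that always passes an explicit -output path."""
    # The caller-chosen output name must be a plain file name, not a path.
    if out_name in ("", ".", "..") or "/" in out_name or "\\" in out_name:
        raise ValueError(f"output name must be a plain file name: {out_name!r}")
    output_path = os.path.join(os.path.abspath(out_dir), out_name)
    return [
        sys.executable, "-m", "pymupdf", "embed-extract", pdf_path,
        "-name", entry_name,
        "-output", output_path,
    ]


command = build_extract_command("poc.pdf", "evil_entry", "extracted", "entry_0.bin")
print(command)
```

The resulting list can be passed to subprocess.run(command, check=True). Because the stored filename never reaches the command line, the traversal sequence inside the PDF has no influence on where the data is written.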

Official Patches

PyMuPDF: Patch commit restricting arbitrary file extraction.

Technical Appendix

CVSS Score: 7.8 / 10
Vector: CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Affected Systems

PyMuPDF CLI Tool

Affected Versions Detail

Product: PyMuPDF (Artifex Software Inc.)
Affected Versions: <= 1.26.5
Fixed Version: 1.26.7

CWE ID: CWE-22
Attack Vector: Local
CVSS Score: 7.8
Impact: Arbitrary File Write
Exploit Status: Proof-of-Concept Available
KEV Status: Not Listed

MITRE ATT&CK Mapping

T1083: File and Directory Discovery (Discovery)
T1565.001: Stored Data Manipulation (Impact)
T1213: Data from Information Repositories (Collection)

CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')

Known Exploits & Detection

GitHub: Test suite PoC within the PyMuPDF repository (tests/test_4767.py)

Vulnerability Timeline

2025-11-19: Vulnerability reported to Artifex Software Inc.
2025-11-21: Fix committed to PyMuPDF repository
2026-02-12: CERT/CC Vulnerability Note VU#504749 released
2026-03-19: CVE-2026-3029 officially published in the NVD
