Mar 19, 2026
A path traversal vulnerability in PyMuPDF's CLI allows arbitrary file writes when extracting embedded files from untrusted PDFs.
CVE-2026-3029 is a high-severity path traversal vulnerability in the PyMuPDF library, specifically within the CLI extraction utility. The flaw allows an attacker to craft a malicious PDF that, when processed without an explicit output directory, writes embedded files to arbitrary locations on the host filesystem.
The affected component is the command-line interface of the PyMuPDF Python library, specifically the pymupdf embed-extract command implemented in the src/__main__.py module.
The root cause involves the CLI tool blindly trusting the filename metadata stored within an embedded file dictionary of a PDF document when an explicit output path is omitted. This metadata is entirely user-controlled and is not sanitized before being passed to underlying filesystem write operations.
An attacker exploits this vulnerability by crafting a malicious PDF document with embedded files containing directory traversal sequences in their filenames. When a victim processes this document using the extraction command, the application resolves the path traversal sequences and writes the embedded payload to arbitrary locations on the host filesystem.
The vulnerability is classified as CWE-22 (Improper Limitation of a Pathname to a Restricted Directory). The core issue manifests in the embedded_get function, which handles the extraction of embedded file streams from PDF documents.
When the pymupdf embed-extract command is executed without the -output argument, the script falls back to using the embedded file's internal name. The application retrieves this value directly from the PDF dictionary via d["filename"] and uses it as the destination path for the extracted data stream.
Because the PDF format allows arbitrary strings in the filename field of an embedded file stream, an attacker can insert relative path sequences such as ../../. Python's open() function does not filter these sequences: it hands the path to the operating system, which resolves them and directs the write outside the intended working directory.
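The resolution step can be illustrated with a minimal sketch. It uses `posixpath` so the result is the same on any platform; the working directory and the helper name `resolve_from` are hypothetical and chosen only for illustration.

```python
import posixpath


def resolve_from(cwd: str, stored_name: str) -> str:
    """Return the path a POSIX system would end up writing to when
    `stored_name` is opened from inside `cwd` (symlinks ignored)."""
    # join() keeps the ".." components; normpath() collapses them the
    # same way the kernel does while walking the path.
    return posixpath.normpath(posixpath.join(cwd, stored_name))


# Two levels of ".." climb out of the working directory.
print(resolve_from("/home/victim/work", "../../target_file.txt"))
# /home/target_file.txt

# An absolute stored name discards the working directory entirely.
print(resolve_from("/home/victim/work", "/etc/cron.d/job"))
# /etc/cron.d/job
```

Both relative traversal and absolute names escape the working directory, so a check that only looks for `..` would not be sufficient.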
Analyzing the vulnerable code in src/__main__.py reveals the exact mechanism of the flaw. Prior to version 1.26.7, the application constructed the output filename using a simple conditional statement without validating the resulting path.

    filename = args.output if args.output else d["filename"]
    with open(filename, "wb") as output:
        output.write(stream)

The patch introduced in commit 603cafe38a183b8bab34f16d05043b4185d8d40a implements strict path validation logic. When an explicit output path is omitted, the application now normalizes the target path using os.path.abspath(filename) and enforces directory pinning.

    filename = args.output if args.output else d["filename"]
    if not args.unsafe and not args.output:
        if os.path.exists(filename):
            sys.exit(f'refusing to overwrite existing file with stored name: {filename}')
        filename_abs = os.path.abspath(filename)
        if not filename_abs.startswith(os.getcwd() + os.sep):
            sys.exit(f'refusing to write stored name outside current directory: {filename}')

This validation ensures the resolved absolute path remains within the current working directory tree. It also implements an existence check to prevent overwriting existing files, closing the arbitrary file write vector.
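For readers who need a similar guard in their own tooling, the following is a minimal sketch of a containment check. It is not the code from the patch: it swaps the `startswith` comparison for `os.path.realpath` plus `os.path.commonpath`, which also resolves symlinks that already exist on disk. The function name `is_within_directory` and the example base directory are assumptions made for illustration.

```python
import os


def is_within_directory(base_dir: str, candidate: str) -> bool:
    """Return True only if `candidate`, interpreted relative to `base_dir`,
    resolves to a location strictly inside `base_dir`."""
    base = os.path.realpath(base_dir)
    # realpath() collapses ".." components and follows existing symlinks.
    target = os.path.realpath(os.path.join(base, candidate))
    try:
        common = os.path.commonpath([base, target])
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
    return target != base and common == base


base = os.path.join(os.sep, "srv", "app", "work")  # hypothetical directory
print(is_within_directory(base, "report.txt"))             # True
print(is_within_directory(base, "../../target_file.txt"))  # False
```

Comparing whole path components avoids the prefix confusion that a plain string comparison invites when the trailing separator is forgotten (for example `/srv/app/work` versus `/srv/app/work-old`).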
Exploiting CVE-2026-3029 requires generating a PDF file with a specifically crafted embedded file object. The attacker modifies the filename or ufilename attributes of the embedded file dictionary to include traversal sequences leading to a target destination path.
The official repository contains a proof-of-concept demonstrating this methodology. An attacker uses the PyMuPDF API to append a new embedded file containing a malicious payload, explicitly setting the filename to a location outside the current directory.

    import pymupdf
    with pymupdf.open() as document:
        document.new_page()
        document.embfile_add(
            'evil_entry',
            b'malicious payload data\n',
            filename="../../target_file.txt",
            ufilename="../../target_file.txt",
            desc="poc",
        )
        document.save("poc.pdf")

The exploit chain completes when a victim executes python -m pymupdf embed-extract poc.pdf -name evil_entry on the command line. The application extracts the payload and writes it to the location dictated by the traversal sequence, relative to the victim's current working directory.
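Before extracting anything from an untrusted document, the stored names can be inspected. The sketch below is one possible audit helper, not part of PyMuPDF: `is_suspicious_name` and `audit_pdf` are hypothetical names, and the flagging rules (absolute path, drive prefix, or any `..` component under POSIX or Windows rules) are an assumption about what is worth rejecting. It relies on the `embfile_names()` and `embfile_info()` methods of a PyMuPDF document.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_suspicious_name(name: str) -> bool:
    """Flag stored names that could escape the extraction directory."""
    for flavour in (PurePosixPath, PureWindowsPath):
        path = flavour(name)
        # Absolute paths, drive prefixes and parent references are all
        # ways of pointing outside the current directory.
        if path.is_absolute() or path.drive or ".." in path.parts:
            return True
    return False


def audit_pdf(pdf_path: str) -> list:
    """Return (entry, stored_name) pairs whose stored name looks unsafe."""
    import pymupdf  # imported lazily so the pure check above has no dependency

    findings = []
    with pymupdf.open(pdf_path) as doc:
        for entry in doc.embfile_names():
            info = doc.embfile_info(entry)
            for key in ("filename", "ufilename"):
                stored = info.get(key) or ""
                if is_suspicious_name(stored):
                    findings.append((entry, stored))
    return findings
```

Run against the proof-of-concept file, `audit_pdf("poc.pdf")` would be expected to report the `evil_entry` item, since both of its stored names contain `..` components.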
The primary impact of this vulnerability is arbitrary file write on the host filesystem running the PyMuPDF CLI utility. This capability allows an attacker to manipulate system state, modify configuration files, or drop executable payloads.
A successful exploit grants the attacker the same filesystem permissions as the user executing the pymupdf command. Even an unprivileged victim exposes user-owned files such as ~/.ssh/authorized_keys; if the utility is run with administrative or root privileges, the attacker can also overwrite critical system binaries and system-wide configuration.
The CVSS v3.1 base score is 7.8 (High), reflecting high impact on confidentiality, integrity, and availability. The attack vector is local and user interaction is required: the attacker must persuade a victim to process the malicious document rather than exploiting a network service directly.
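The score can be reproduced from the vector string CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H. The sketch below applies the base-score equations and metric weights from the CVSS v3.1 specification; it only handles the scope-unchanged case used by this vector, and the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers scope-unchanged vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 7.8
```

The impact subscore is about 5.87 and the exploitability subscore about 1.83; their sum, roughly 7.71, rounds up to 7.8.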
The primary remediation strategy is upgrading the PyMuPDF library to version 1.26.7 or later. The patched versions enforce strict directory boundaries and implement safe default behaviors for the extraction CLI.
For environments where immediate upgrading is not feasible, users must avoid running the pymupdf embed-extract command on untrusted PDF documents. If extraction is strictly necessary, users must explicitly define the destination path using the -output argument.
Providing the -output argument bypasses the vulnerable logic path entirely by discarding the embedded filename metadata. Additionally, users operating version 1.26.7 or later must refrain from using the newly introduced -unsafe flag when processing files from unverified sources.
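Where extraction is scripted rather than run by hand, the same idea can be applied through the library API. The following is a minimal sketch under stated assumptions: `safe_destination` and `extract_embedded` are hypothetical helpers, not PyMuPDF functions, and the policy of keeping only the final path component of the stored name is one possible choice. It uses the document methods `embfile_info()` and `embfile_get()`.

```python
import os


def safe_destination(dest_dir: str, stored_name: str) -> str:
    """Build an output path inside `dest_dir` from an untrusted stored name,
    keeping only its final component."""
    # Treat both separator styles as separators, then drop every directory part.
    leaf = stored_name.replace("\\", "/").rsplit("/", 1)[-1]
    if leaf in ("", ".", ".."):
        raise ValueError(f"unusable stored name: {stored_name!r}")
    return os.path.join(dest_dir, leaf)


def extract_embedded(pdf_path: str, entry_name: str, dest_dir: str) -> str:
    """Write one embedded file into `dest_dir` and return the path written."""
    import pymupdf  # imported lazily; only needed for real extraction

    with pymupdf.open(pdf_path) as doc:
        info = doc.embfile_info(entry_name)
        data = doc.embfile_get(entry_name)
    target = safe_destination(dest_dir, info["filename"])
    # Mode "xb" fails if the file already exists, so nothing is overwritten.
    with open(target, "xb") as output:
        output.write(data)
    return target
```

On the command line, the equivalent precaution is to pass the destination explicitly, for example python -m pymupdf embed-extract poc.pdf -name evil_entry -output extracted.bin, so the stored name is never used.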
CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

| Product | Affected Versions | Fixed Version |
|---|---|---|
| PyMuPDF (Artifex Software Inc.) | <= 1.26.5 | 1.26.7 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-22 |
| Attack Vector | Local |
| CVSS Score | 7.8 |
| Impact | Arbitrary File Write |
| Exploit Status | Proof-of-Concept Available |
| KEV Status | Not Listed |
Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')