The Never-Ending PDF: Crashing pypdf with a Lie
Jan 11, 2026·5 min read
Executive Summary (TL;DR)
Attackers can cause a Denial of Service (DoS) by providing a PDF with a missing `/Root` and a massive `/Size` value. `pypdf` (pre-6.6.0) attempts to iterate through the fictitious object count to 'recover' the file, leading to 100% CPU usage and application hangs.
A resource exhaustion vulnerability in the popular `pypdf` library allows attackers to hang applications via a maliciously crafted PDF. By inflating the trailer's `/Size` value and omitting the `/Root` entry, attackers trigger a recovery loop that is bounded only by a number they control.
The Hook: The Good Samaritan Anti-Pattern
In the world of file parsing, tolerance is usually a virtue. We want our libraries to handle slightly broken files without crashing. pypdf, the ubiquitous Python library for PDF manipulation, takes this philosophy to heart. It defaults to a 'non-strict' mode where it tries to be the hero, repairing malformed documents so your application doesn't have to deal with errors.
But in security, 'trying to be helpful' is often synonymous with 'opening a massive attack surface.' CVE-2026-22690 is a classic example of logic that works perfectly for accidental corruption but fails spectacularly against intentional malice. By trying to recover a missing piece of data, the library walks right into a trap laid by the attacker, turning a 2KB file into an infinite resource sink.
The Flaw: A Wild Goose Chase
To understand this bug, you need to know a tiny bit about PDF structure. Every PDF has a trailer dictionary that acts as the map to the file. Crucially, it contains a /Root key pointing to the Document Catalog. If this /Root is missing, the file is technically garbage.
However, pypdf thinks, "No problem, I'll find it!" It decides to scan the file for the catalog manually. But how many objects should it check? It looks at the /Size entry in the trailer, which is supposed to give the number of entries in the cross-reference table (one more than the highest object number in the file).
Here lies the logic flaw: The /Size entry is just a number written in text. It is completely attacker-controlled. pypdf doesn't validate if the file is actually large enough to contain that many objects. It simply trusts the number. If the attacker sets /Size to 2 billion, pypdf dutifully prepares to check 2 billion objects, one by one.
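The missing check can be illustrated with a minimal sketch. This is not pypdf's code, and the function name and the `min_bytes_per_object` default are assumptions made for illustration. The idea is that every object has to occupy some bytes on disk, so a declared /Size that could not possibly fit in the file is a red flag. Object streams can pack objects tightly, so treat the bound as a loose heuristic rather than a proof.

```python
def size_is_plausible(declared_size: int, file_length: int,
                      min_bytes_per_object: int = 4) -> bool:
    """Return True if a file of file_length bytes could hold declared_size objects.

    min_bytes_per_object is a hypothetical, deliberately generous lower bound;
    tune it for your own corpus.
    """
    if declared_size < 0 or file_length < 0:
        return False
    return declared_size * min_bytes_per_object <= file_length


# A ~100-byte file claiming 2,147,483,647 objects fails the check...
print(size_is_plausible(2_147_483_647, 100))  # False
# ...while a 4-entry trailer in a 600-byte file passes.
print(size_is_plausible(4, 600))              # True
```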
The Code: The Loop of Death
Let's dissect the crime scene in pypdf/_reader.py. In versions prior to 6.6.0, the recovery logic was a simple linear loop bounded only by the untrusted input.
```python
# The Vulnerable Logic (Pre-patch)
# nb is directly taken from the PDF trailer without validation
nb = cast(int, self.trailer.get("/Size", 0))

# The Loop of Death
# If nb is 2,147,483,647, this loop runs effectively forever
for i in range(nb):
    try:
        # get_object involves disk seek and parsing operations
        # Doing this billions of times consumes massive CPU
        obj = self.get_object(i + 1)
        if isinstance(obj, DictionaryObject) and obj.get("/Type") == "/Catalog":
            self._validated_root = obj
            break
    except Exception:
        pass
```

The fix, introduced in commit 294165726b646bb7799be1cc787f593f2fdbcf45, is basically a sanity check. The maintainers introduced a hard limit (root_object_recovery_limit) to stop the madness if the object isn't found quickly.
```python
# The Fix
for i in range(number_of_objects):
    # The Safety Valve
    if i >= self._root_object_recovery_limit:  # Defaults to 10,000
        raise LimitReachedError("Maximum Root object recovery limit reached.")
    # ... continue search ...
```

The Exploit: Crafting the PDF Bomb
You don't need complex fuzzers to exploit this. You just need a text editor. The goal is to trick the library into entering the recovery path (by deleting /Root) and then giving it a massive search space (by inflating /Size).
Here is the recipe for a "PDF Bomb":
- Create a minimal PDF.
- Delete the `/Root` entry from the trailer.
- Set `/Size` to a massive integer (e.g., `2147483647`).
The Payload:
```
%PDF-1.7
trailer
<<
/Size 2147483647 % The Lie: "I have 2 billion objects"
% /Root ... % The Missing Map: Triggers recovery mode
>>
%%EOF
```

When a vulnerable Python application runs `PdfReader("bomb.pdf")`, it parses the trailer, notices the missing root, reads the fake size, and effectively hangs. The process will consume 100% of a CPU core indefinitely (or until the OS kills it).
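On the defensive side, the two tells in this payload (no /Root, absurd /Size) can be spotted before the file ever reaches the parser. The sketch below is a rough triage heuristic, not part of pypdf; the function name and the returned keys are made up for illustration, and it only inspects a classic plain-text trailer, not cross-reference streams. Note that it strips comments first, because in the payload the string "/Root" does appear, just behind a `%`.

```python
import re

# Heuristic patterns; real PDFs can also keep the trailer in a compressed
# cross-reference stream, which this sketch does not inspect.
_COMMENT = re.compile(rb"%[^\r\n]*")
_SIZE = re.compile(rb"/Size\s+(\d+)")
_ROOT = re.compile(rb"/Root\b")


def triage_trailer(data: bytes) -> dict:
    """Report the declared /Size and whether /Root appears in the last trailer."""
    start = data.rfind(b"trailer")
    if start == -1:
        return {"has_trailer": False, "has_root": False, "declared_size": None}
    # Drop comments first so a commented-out "/Root" is not counted.
    tail = _COMMENT.sub(b"", data[start:])
    size_match = _SIZE.search(tail)
    return {
        "has_trailer": True,
        "has_root": _ROOT.search(tail) is not None,
        "declared_size": int(size_match.group(1)) if size_match else None,
    }


bomb = (
    b"%PDF-1.7\n"
    b"trailer\n"
    b"<<\n"
    b"/Size 2147483647 % The Lie\n"
    b"% /Root ... % The Missing Map\n"
    b">>\n"
    b"%%EOF"
)
print(triage_trailer(bomb))
# {'has_trailer': True, 'has_root': False, 'declared_size': 2147483647}
```

A file that reports `has_root: False` together with a large `declared_size` is a good candidate for rejection or quarantine before parsing.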
The Impact: Why Should We Panic?
"But it's just a DoS!" you might say. "It's not RCE!" While true, this vulnerability is a nightmare for automated document processing pipelines. Think about a resume ingestion service, an expense report scanner, or an email attachment analyzer.
If an attacker uploads just a handful of these 2KB files, they can exhaust all worker threads in your processing cluster. The server isn't technically crashing—it's just very, very busy counting to infinity. Health checks might even pass because the HTTP server is up, but the background workers are zombies.
The CVSS score is a surprisingly low 2.7, mostly because the vector rates the availability impact as low (VA:L) and records no confidentiality or integrity impact. But don't let the score fool you; if your business relies on processing PDFs, this is a high-severity operational risk.
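Because the damage comes from workers that never return, one generic defense-in-depth measure is to time-box parsing in a throwaway process. The sketch below is an assumption about how a pipeline might do that, not something the advisory prescribes; `run_time_boxed` and `PARSE_SNIPPET` are hypothetical names, and the timeout value is arbitrary.

```python
import subprocess
import sys


def run_time_boxed(code: str, args: list[str], timeout_s: float) -> bool:
    """Run a Python snippet in a child interpreter.

    Returns True only if the child exits cleanly within timeout_s seconds.
    subprocess.run kills the child when the timeout expires.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code, *args],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


# Hypothetical usage: parse an upload in a separate process so that a hang
# costs one short-lived child instead of a long-lived worker.
PARSE_SNIPPET = "import sys; from pypdf import PdfReader; PdfReader(sys.argv[1])"
# ok = run_time_boxed(PARSE_SNIPPET, ["upload.pdf"], timeout_s=5.0)
```

This does not fix the bug; it only caps how long a single malicious file can occupy a CPU core.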
The Fix: Stopping the Bleeding
The remediation is straightforward: Update pypdf to version 6.6.0 or higher. This version introduces the root_object_recovery_limit which defaults to 10,000 checks—plenty for legitimate files, but small enough to fail fast on malicious ones.
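To see why a cap turns the hang into a fast failure, here is a self-contained simulation of a bounded search. It is a sketch that mirrors the shape of the patched loop, not pypdf's actual implementation: `find_catalog`, the `get_object` callback, and the local `LimitReachedError` class are stand-ins, and plain dicts play the role of PDF dictionary objects.

```python
class LimitReachedError(Exception):
    """Local stand-in for the error raised when the search cap is hit."""


def find_catalog(get_object, declared_size: int, limit: int = 10_000):
    """Search object numbers 1..declared_size for a catalog, giving up at limit."""
    for i in range(declared_size):
        if i >= limit:
            raise LimitReachedError("Maximum Root object recovery limit reached.")
        try:
            obj = get_object(i + 1)
        except Exception:
            # Missing or unparsable object: keep looking.
            continue
        if isinstance(obj, dict) and obj.get("/Type") == "/Catalog":
            return obj
    return None


# A slightly broken but honest file: the catalog is object 3.
objects = {3: {"/Type": "/Catalog"}}
print(find_catalog(objects.__getitem__, declared_size=5))
# {'/Type': '/Catalog'}

# The bomb: no objects at all, but a declared size of 2,147,483,647.
try:
    find_catalog({}.__getitem__, declared_size=2_147_483_647)
except LimitReachedError as exc:
    print("gave up early:", exc)
```

With the cap in place, the bomb costs at most 10,000 failed lookups instead of roughly two billion.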
If you cannot upgrade immediately, there is a configuration workaround. Force the library to be strict. In strict mode, pypdf does not attempt to repair broken files; it simply raises an error. This completely bypasses the vulnerable recovery logic.
```python
# The "I Don't Trust You" Configuration
from pypdf import PdfReader

# strict=True disables the recovery heuristics
reader = PdfReader("untrusted_file.pdf", strict=True)
```

Technical Appendix
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N/E:U
Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| pypdf (py-pdf) | < 6.6.0 | 6.6.0 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-400 (Uncontrolled Resource Consumption) |
| CVSS Score | 2.7 (Low) - CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L |
| Attack Vector | Network / Local File |
| Exploit Status | PoC Available |
| Impact | Denial of Service (Thread Hang) |
| Patch Commit | 294165726b646bb7799be1cc787f593f2fdbcf45 |
MITRE ATT&CK Mapping
The software does not properly restrict the size or amount of resources that are requested or influenced by an actor, which can be used to consume more resources than intended.