The root cause lies in the implementation of the Run-Length Encoding (RLE) decoder in pypdf/filters.py. The RLE format uses a control byte n to determine how to process subsequent data. If n is between 0 and 127, the decoder copies the next n + 1 bytes literally. If n is between 129 and 255, the decoder repeats the next byte 257 - n times, which is what makes compression of repeated data possible. The value 128 marks End-of-Data (EOD).
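To make the control-byte rules concrete, here is a minimal standalone sketch of an RLE decoder that follows them. The name `rle_decode_sketch` is hypothetical and this is not pypdf's implementation; it is deliberately unbounded, like the affected code path, so it should only be fed small inputs.

```python
def rle_decode_sketch(data: bytes) -> bytes:
    """Illustrative RunLengthDecode (hypothetical helper, not pypdf's code)."""
    out = []
    index = 0
    while index < len(data):
        n = data[index]
        index += 1
        if n == 128:
            # 128 is the End-of-Data (EOD) marker.
            break
        if n < 128:
            # 0..127: copy the next n + 1 bytes literally.
            out.append(data[index:index + n + 1])
            index += n + 1
        else:
            # 129..255: repeat the next byte 257 - n times.
            if index >= len(data):
                break  # truncated stream: no byte left to repeat
            out.append(data[index:index + 1] * (257 - n))
            index += 1
    return b"".join(out)


# Literal run "abc" (control byte 0x02), then "Z" repeated 257 - 254 = 3 times.
print(rle_decode_sketch(b"\x02abc\xfeZ\x80"))
```

Note that nothing in this loop looks at how large `out` has become, which is the property the rest of this analysis is concerned with.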

In affected versions, the decode function iterates through the input stream inside a while loop that continues until an End-of-Data (EOD) marker is reached. Crucially, the loop appends decoded bytes to a list (lst) without checking the cumulative size of that list against a safety threshold.

An attacker can exploit this by providing a stream consisting of repeated instructions to duplicate bytes. For example, the sequence 0x81 (decimal 129) followed by a single byte triggers the decoder to output 128 copies of that byte. By chaining these sequences, a small malicious payload can force the interpreter to construct a byte string gigabytes in size, exceeding the available RAM of the process.
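The amplification can be estimated with simple arithmetic. The sketch below (constant and function names are hypothetical) computes how much encoded input is needed to reach a given decoded size, assuming every two-byte pair of 0x81 plus one data byte expands to 128 bytes and the stream ends with a one-byte EOD marker.

```python
import math

BYTES_PER_RUN_IN = 2     # control byte 0x81 + the byte to repeat
BYTES_PER_RUN_OUT = 128  # 257 - 129


def encoded_size_for(target_output_bytes: int) -> int:
    """Encoded bytes (including the 1-byte EOD marker) needed to decode to
    at least target_output_bytes using maximal 0x81 runs."""
    runs = math.ceil(target_output_bytes / BYTES_PER_RUN_OUT)
    return runs * BYTES_PER_RUN_IN + 1


one_gib = 1024 ** 3
print(encoded_size_for(one_gib))              # encoded bytes for 1 GiB of output
print(BYTES_PER_RUN_OUT // BYTES_PER_RUN_IN)  # per-run expansion ratio
```

Under these assumptions, roughly 16 MiB of encoded input is enough to request 1 GiB of decoded output.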

CVE-2026-28351: Uncontrolled Resource Consumption in pypdf RunLengthDecode

Alon Barad

Software Engineer

Feb 28, 2026

Executive Summary (TL;DR)

pypdf versions before 6.7.4 contain a vulnerability in the RunLengthDecode filter that allows unbounded memory allocation. By crafting a PDF with a highly repetitive RLE stream, an attacker can crash the host application through an Out-Of-Memory (OOM) condition. The fix in version 6.7.4 introduces a limit on the decoded output size.

A resource exhaustion vulnerability exists in pypdf versions prior to 6.7.4, specifically within the RunLengthDecode filter implementation. The flaw allows attackers to trigger excessive memory allocation via crafted PDF streams, leading to Denial of Service (DoS) through Out-Of-Memory (OOM) conditions. This issue affects automated PDF processing pipelines where untrusted files are parsed without strict resource limits.

Vulnerability Overview

The pypdf library is a widely used pure-Python PDF toolkit capable of splitting, merging, cropping, and transforming PDF files. It includes various filters to handle the decompression of data streams embedded within PDF documents. One such filter is RunLengthDecode, which implements a simple compression algorithm defined in the ISO 32000-1 specification.

CVE-2026-28351 identifies a flaw in how pypdf handles RunLengthDecode streams. The library failed to impose an upper bound on the size of the decompressed output. This omission allows a specially crafted input stream (often very small) to expand into a disproportionately large amount of data in memory. This class of vulnerability is known as a "decompression bomb" or "zip bomb."

When a vulnerable application processes a malicious PDF, the pypdf decoder attempts to allocate memory for the expanded data until system resources are exhausted. This results in an Out-Of-Memory (OOM) crash, rendering the application or the entire host service unavailable. The vulnerability is tracked as CWE-400 (Uncontrolled Resource Consumption).

Root Cause Analysis

Code Analysis

The vulnerability is evident in the RunLengthDecode.decode static method. Below is the comparison between the vulnerable logic and the remediated code in version 6.7.4.

Vulnerable Code (pypdf < 6.7.4)

The loop purely follows the input instructions without any guardrails on len(lst) or the total bytes generated.

# pypdf/filters.py
 
@staticmethod
def decode(data: bytes, parameters: Optional[Dict[str, Any]] = None) -> bytes:
    # ... (initialization)
    while True:
        # ... (read length byte)
        if length < 128:
            # Copy literal bytes
            lst.append(data[index : index + length + 1])
            index += length + 1
        elif length > 128:
            # Repeat next byte (257 - length) times
            length = 257 - length
            lst.append(bytes((data[index],)) * length) # Unbounded allocation
            index += 1
        # ...
    return b"".join(lst)

Patched Code (pypdf >= 6.7.4)

The fix introduces a constant RUN_LENGTH_MAX_OUTPUT_LENGTH (defaulting to 75MB) and tracks the total_length during iteration. If the decoded size exceeds this limit, the operation is aborted.

# pypdf/filters.py
 
# Security constant added
RUN_LENGTH_MAX_OUTPUT_LENGTH = 75_000_000
 
@staticmethod
def decode(data: bytes, parameters: Optional[Dict[str, Any]] = None) -> bytes:
    # ... (initialization)
    total_length = 0  # Accumulator for safety check
    
    while True:
        # ... (parsing logic)
        
        # Update the accumulator
        total_length += length
        
        # Enforce the limit
        if total_length > RUN_LENGTH_MAX_OUTPUT_LENGTH:
            raise LimitReachedError("Limit reached while decompressing.")
            
        # ... (append logic)
    return b"".join(lst)
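The accumulator pattern can also be illustrated outside pypdf. The following is a minimal standalone sketch, not the library's patched code: `MAX_OUTPUT`, `OutputLimitError`, and `bounded_rle_decode` are hypothetical names, and the check is placed before each allocation so that an oversized run is rejected before its bytes are created.

```python
MAX_OUTPUT = 75_000_000  # mirrors the 75 MB figure above; the name is hypothetical


class OutputLimitError(Exception):
    """Hypothetical stand-in for pypdf's LimitReachedError."""


def bounded_rle_decode(data: bytes, limit: int = MAX_OUTPUT) -> bytes:
    """Decode an RLE stream, refusing to produce more than `limit` bytes."""
    out = []
    total = 0  # running count of decoded bytes
    index = 0
    while index < len(data):
        n = data[index]
        index += 1
        if n == 128:  # End-of-Data marker
            break
        if n < 128:
            # Literal run: n + 1 bytes, capped at what the input still holds.
            chunk_len = min(n + 1, len(data) - index)
        else:
            if index >= len(data):
                break  # truncated stream: no byte left to repeat
            chunk_len = 257 - n
        # Check the budget before allocating the chunk.
        total += chunk_len
        if total > limit:
            raise OutputLimitError(f"decoded output would exceed {limit} bytes")
        if n < 128:
            out.append(data[index:index + chunk_len])
            index += chunk_len
        else:
            out.append(data[index:index + 1] * chunk_len)
            index += 1
    return b"".join(out)


try:
    bounded_rle_decode(b"\x81A" * 10 + b"\x80", limit=1000)
except OutputLimitError as exc:
    print("rejected:", exc)
```

With this structure the memory cost of a hostile stream is bounded by `limit` plus the input size, regardless of the expansion ratio.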

Exploitation

Exploitation requires the attacker to submit a PDF file where an internal object (typically an image or a content stream) uses the RunLengthDecode filter. This is a standard PDF feature, so the presence of the filter itself is not suspicious.

The attacker constructs a stream payload designed to maximize the expansion ratio. In RLE, the byte 0x81 (decimal 129) is optimal for this purpose, as it commands the decoder to repeat the subsequent byte 128 times (calculated as 257 - 129).

Proof of Concept Logic:

from pypdf.filters import RunLengthDecode
 
# 1. Create a payload where every 2 bytes of input become 128 bytes of output.
#    Expansion ratio: 64:1
runs = 1_000_000 
encoded_payload = (b"\x81A" * runs) + b"\x80"
 
# 2. Input size: ~2 MB
# 3. Target Output size: 128 MB (128 bytes * 1,000,000)
 
# Decodes to ~128 MB in vulnerable versions (raise `runs` to exhaust memory); raises LimitReachedError in fixed versions.
RunLengthDecode.decode(encoded_payload)

In a real-world attack, multiple such streams can be chained or nested to consume memory rapidly. Since pypdf is often used in web backends to process user-uploaded documents (e.g., for resizing, metadata extraction, or OCR preprocessing), a single malicious upload can crash the worker process handling the request.

Impact Assessment

The primary impact of CVE-2026-28351 is Denial of Service (DoS). The vulnerability allows for the exhaustion of system memory resources (RAM).

Operational Impact:

Service Unavailability: If parsing occurs in the main application process, the entire service can hang or crash. In multi-process deployments (such as Celery or Gunicorn workers), the worker handling the file is typically terminated, either by a Python MemoryError or by the OS OOM killer.
Resource Starvation: Even if the process does not crash immediately, the excessive swap usage can degrade performance for other co-located services.
Automated Pipeline Disruption: Systems that automatically index or process PDFs (e.g., legal discovery platforms, receipt scanners) are highly susceptible, as they process untrusted external input by design.

Severity Metrics:

CVSS v4.0: 6.9 (Medium) - AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L
CIA Triad: The vulnerability impacts Availability exclusively. Confidentiality and Integrity are not directly compromised, as the attacker cannot read memory or execute arbitrary code beyond causing the crash.

Remediation

The vulnerability is patched in pypdf version 6.7.4. The fix involves a hardcoded safety limit for the RunLengthDecode filter.

Remediation Steps:

Identify Usage: Check your dependency trees (e.g., pip freeze or poetry show --tree) for pypdf versions below 6.7.4.
Upgrade: Update the package via your package manager:
```
pip install --upgrade "pypdf>=6.7.4"
```
Verify: Ensure the installed version is 6.7.4 or higher.
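The verification step can be scripted. The sketch below uses the standard-library `importlib.metadata.version` call to read the installed version; the helper names are hypothetical, and the simple parser only looks at the leading numeric release segment, so it ignores pre-release suffixes (`packaging.version.Version` is the more robust choice if that package is available).

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_RELEASE = (6, 7, 4)  # first patched pypdf release per this advisory


def parse_release(version_string: str) -> tuple:
    """Parse the leading numeric release, e.g. '6.7.4' -> (6, 7, 4)."""
    match = re.match(r"[0-9]+(?:\.[0-9]+)*", version_string)
    if not match:
        return ()
    return tuple(int(part) for part in match.group().split("."))


def is_patched(version_string: str) -> bool:
    """True if the release is 6.7.4 or later (pre-release tags are ignored)."""
    release = parse_release(version_string)
    # Pad short versions such as '7' or '6.8' to three components.
    release = release + (0,) * (3 - len(release))
    return release >= FIXED_RELEASE


def installed_pypdf_is_patched():
    """Return True/False for the installed pypdf, or None if it is absent."""
    try:
        return is_patched(version("pypdf"))
    except PackageNotFoundError:
        return None


print(installed_pypdf_is_patched())
```

A check like this could run in CI or at application start-up to flag environments that still resolve to an affected version.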

Mitigation / Workarounds: If an immediate upgrade is not feasible, you can monkey-patch the RunLengthDecode.decode method in your application initialization code to include the length check, as shown in the patch analysis. Alternatively, implement strict file size limits on uploaded PDFs, although this is an imperfect defense as the malicious PDF can be small (high compression ratio).
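The monkey-patching workaround mentioned above could look roughly like the following sketch. It pre-scans the encoded stream to compute the decoded size without allocating the output, then delegates to the original decoder. The names `rle_decoded_size` and `install_rle_guard` are hypothetical, the use of `ValueError` is an arbitrary choice, and the wrapper assumes that `RunLengthDecode.decode` is a static method receiving the raw stream bytes as its first positional argument; verify this against the pypdf version you actually run.

```python
def rle_decoded_size(data: bytes, limit: int) -> int:
    """Return the decoded size of an RLE stream without building the output.

    Raises ValueError as soon as the running total exceeds `limit`.
    """
    total = 0
    index = 0
    while index < len(data):
        n = data[index]
        index += 1
        if n == 128:  # End-of-Data marker
            break
        if n < 128:
            # Literal run: count only the bytes the input actually contains.
            run = min(n + 1, len(data) - index)
            index += n + 1
        else:
            if index >= len(data):
                break  # truncated stream: nothing to repeat
            run = 257 - n
            index += 1
        total += run
        if total > limit:
            raise ValueError(f"RLE stream would decode to more than {limit} bytes")
    return total


def install_rle_guard(limit: int = 75_000_000) -> None:
    """Wrap pypdf's RunLengthDecode.decode with the pre-scan.

    Assumption: pypdf is installed and decode() takes the stream bytes first.
    """
    from pypdf.filters import RunLengthDecode

    original = RunLengthDecode.decode

    def guarded(data, *args, **kwargs):
        rle_decoded_size(data, limit)  # raises before any large allocation
        return original(data, *args, **kwargs)

    RunLengthDecode.decode = staticmethod(guarded)


print(rle_decoded_size(b"\x02abc\xfeZ\x80", limit=100))
```

Calling `install_rle_guard()` once during application start-up would apply the check to every stream decoded afterwards. The pre-scan adds a pure-Python pass over each RLE stream, so it is best treated as a stopgap until the upgrade is possible.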

Official Patches

py-pdf: Pull Request #3664 fixing RunLengthDecode unbounded growth

Technical Appendix

CVSS Score

6.9 / 10

CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N

Affected Systems

pypdf < 6.7.4

Affected Versions Detail

Product	Affected Versions	Fixed Version
pypdf (py-pdf)	< 6.7.4	6.7.4

Attribute	Detail
CWE ID	CWE-400
CVSS v4.0	6.9
Attack Vector	Network
Impact	Denial of Service (DoS)
Exploit Status	PoC Available
Fix Version	6.7.4

MITRE ATT&CK Mapping

T1499.003: Endpoint Denial of Service: OS Resource Exhaustion

Impact

CWE-400

Uncontrolled Resource Consumption

The software does not properly control the allocation and maintenance of a limited resource, thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources.

Known Exploits & Detection

GitHub Security Advisory: Advisory containing PoC logic for RLE expansion.

Vulnerability Timeline

2026-02-25: Robustness fix for /Annots handling merged
2026-02-27: Security fix for RunLengthDecode limit merged
2026-02-27: pypdf v6.7.4 released
2026-02-27: Public disclosure of CVE-2026-28351
