CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

© 2026 CVEReports. All rights reserved.

Made with love by Amit Schendel & Alon Barad



CVE-2026-28351
6.9

CVE-2026-28351: Uncontrolled Resource Consumption in pypdf RunLengthDecode

Alon Barad
Software Engineer

Feb 28, 2026 · 6 min read

PoC Available

Executive Summary (TL;DR)

pypdf versions before 6.7.4 contain a vulnerability in the RunLengthDecode filter that allows for unbounded memory allocation. By crafting a PDF with a malformed RLE stream, an attacker can crash the host application via OOM. The fix in version 6.7.4 introduces strict output size limits.

A resource exhaustion vulnerability exists in pypdf versions prior to 6.7.4, in the RunLengthDecode filter implementation. A crafted PDF stream can make the decoder allocate far more memory than the size of the stream suggests, leading to Denial of Service (DoS) through Out-Of-Memory (OOM) conditions. The issue matters most for automated PDF processing pipelines that parse untrusted files without strict resource limits.

Vulnerability Overview

The pypdf library is a widely used pure-Python PDF toolkit capable of splitting, merging, cropping, and transforming PDF files. It includes various filters to handle the decompression of data streams embedded within PDF documents. One such filter is RunLengthDecode, which implements a simple compression algorithm defined in the ISO 32000-1 specification.

CVE-2026-28351 identifies a flaw in how pypdf handles RunLengthDecode streams: the library did not impose an upper bound on the size of the decompressed output. A specially crafted input stream can therefore expand into a much larger amount of data in memory, up to 64 times its own size. This class of vulnerability is known as a "decompression bomb" or "zip bomb."

When a vulnerable application processes a malicious PDF, the pypdf decoder attempts to allocate memory for the expanded data until system resources are exhausted. This results in an Out-Of-Memory (OOM) crash, rendering the application or the entire host service unavailable. The vulnerability is tracked as CWE-400 (Uncontrolled Resource Consumption).

Root Cause Analysis

The root cause lies in the implementation of the Run-Length Encoding (RLE) decoder in pypdf/filters.py. The RLE format uses a control byte n to determine how to process subsequent data. If n is between 0 and 127, the next n + 1 bytes are copied literally. If n is between 129 and 255, the decoder repeats the next byte 257 - n times, which is what compresses repeated data. The value 128 marks End-of-Data (EOD).

In affected versions, the decode function iterates through the input stream inside a while loop that continues until an End-of-Data (EOD) marker is reached. Crucially, the loop appends decoded bytes to a list (lst) without checking the cumulative size of that list against a safety threshold.

An attacker can exploit this by providing a stream consisting of repeated instructions to duplicate bytes. For example, the sequence 0x81 (decimal 129) followed by a single byte triggers the decoder to output 128 copies of that byte, so every 2 bytes of input become 128 bytes of output (a 64:1 ratio). By chaining these sequences, a payload of roughly 16 MB decodes to about 1 GB, and a larger payload can exceed the RAM available to the process.
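The expansion arithmetic can be checked with a few lines of Python. This is an illustrative sketch only: the helper names `run_cost` and `input_bytes_for` are hypothetical and are not part of pypdf.

```python
EOD = 128  # end-of-data marker in a RunLengthDecode stream


def run_cost(control: int) -> tuple[int, int]:
    """Return (input_bytes, output_bytes) for one run that starts with `control`."""
    if not 0 <= control <= 255:
        raise ValueError("control byte must be in 0..255")
    if control < EOD:
        # Literal run: the control byte plus (control + 1) bytes copied as-is.
        return control + 2, control + 1
    if control == EOD:
        # End of data: one byte consumed, nothing produced.
        return 1, 0
    # Repeat run: the control byte plus one data byte repeated (257 - control) times.
    return 2, 257 - control


def input_bytes_for(target_output: int, control: int = 0x81) -> int:
    """Input size, including the EOD byte, that decodes to at least `target_output` bytes."""
    consumed, produced = run_cost(control)
    if produced == 0:
        raise ValueError("the EOD marker produces no output")
    runs = (target_output + produced - 1) // produced  # integer ceiling division
    return runs * consumed + 1


print(run_cost(0x81))              # (2, 128): the 64:1 best case for an attacker
print(run_cost(0xFF))              # (2, 2): the weakest repeat run
print(input_bytes_for(1024 ** 3))  # 16777217 bytes of input for 1 GiB of output
```

The control byte 0x81 gives the largest output per input byte, which is why it is the natural choice for a malicious stream.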

Code Analysis

The vulnerability is evident in the RunLengthDecode.decode static method. Below is the comparison between the vulnerable logic and the remediated code in version 6.7.4.

Vulnerable Code (pypdf < 6.7.4)

The loop purely follows the input instructions without any guardrails on len(lst) or the total bytes generated.

# pypdf/filters.py
 
@staticmethod
def decode(data: bytes, parameters: Optional[Dict[str, Any]] = None) -> bytes:
    # ... (initialization)
    while True:
        # ... (read length byte)
        if length < 128:
            # Copy literal bytes
            lst.append(data[index : index + length + 1])
            index += length + 1
        elif length > 128:
            # Repeat next byte (257 - length) times
            length = 257 - length
            lst.append(bytes((data[index],)) * length) # Unbounded allocation
            index += 1
        # ...
    return b"".join(lst)

Patched Code (pypdf >= 6.7.4)

The fix introduces a constant RUN_LENGTH_MAX_OUTPUT_LENGTH (75,000,000 bytes, about 75 MB, by default) and tracks total_length during iteration. If the decoded size exceeds this limit, decoding is aborted with a LimitReachedError.

# pypdf/filters.py
 
# Security constant added
RUN_LENGTH_MAX_OUTPUT_LENGTH = 75_000_000
 
@staticmethod
def decode(data: bytes, parameters: Optional[Dict[str, Any]] = None) -> bytes:
    # ... (initialization)
    total_length = 0  # Accumulator for safety check
    
    while True:
        # ... (parsing logic)
        
        # Update the accumulator
        total_length += length
        
        # Enforce the limit
        if total_length > RUN_LENGTH_MAX_OUTPUT_LENGTH:
            raise LimitReachedError("Limit reached while decompressing.")
            
        # ... (append logic)
    return b"".join(lst)
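To make the accumulate-and-check pattern concrete, here is a minimal standalone sketch of a bounded run-length decoder. It is not pypdf's code: `bounded_rle_decode`, `OutputLimitExceeded`, and the `max_output` parameter are hypothetical names chosen for illustration, and this version checks the limit before allocating each run.

```python
class OutputLimitExceeded(Exception):
    """Hypothetical stand-in for the library's limit error."""


def bounded_rle_decode(data: bytes, max_output: int = 75_000_000) -> bytes:
    """Decode a run-length stream, refusing to produce more than `max_output` bytes."""
    chunks = []
    total = 0
    index = 0
    while index < len(data):
        control = data[index]
        index += 1
        if control == 128:  # End-of-Data marker
            break
        # Literal runs copy (control + 1) bytes; repeat runs emit (257 - control) copies.
        run_length = control + 1 if control < 128 else 257 - control
        total += run_length
        if total > max_output:
            # Abort before the allocation happens.
            raise OutputLimitExceeded(f"decoded size would exceed {max_output} bytes")
        if control < 128:
            chunk = data[index:index + run_length]
            if len(chunk) < run_length:
                raise ValueError("truncated literal run")
            index += run_length
        else:
            if index >= len(data):
                raise ValueError("truncated repeat run")
            chunk = data[index:index + 1] * run_length
            index += 1
        chunks.append(chunk)
    return b"".join(chunks)


print(bounded_rle_decode(b"\x02abc\xfeZ\x80"))  # b'abcZZZ'

try:
    bounded_rle_decode(b"\x81A" * 10 + b"\x80", max_output=1000)
except OutputLimitExceeded as exc:
    print("rejected:", exc)
```

Checking before the allocation means no single run can push memory use past the limit, at the cost of one integer comparison per run.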

Exploitation

Exploitation requires the attacker to submit a PDF file where an internal object (typically an image or a content stream) uses the RunLengthDecode filter. This is a standard PDF feature, so the presence of the filter itself is not suspicious.

The attacker constructs a stream payload designed to maximize the expansion ratio. In RLE, the byte 0x81 (decimal 129) is optimal for this purpose, as it commands the decoder to repeat the subsequent byte 128 times (calculated as 257 - 129).

Proof of Concept Logic:

from pypdf.filters import RunLengthDecode
 
# 1. Create a payload where every 2 bytes of input become 128 bytes of output.
#    Expansion ratio: 64:1
runs = 1_000_000 
encoded_payload = (b"\x81A" * runs) + b"\x80"
 
# 2. Input size: ~2 MB
# 3. Target Output size: 128 MB (128 bytes * 1,000,000)
 
# Allocates roughly 128 MB in vulnerable versions (raise `runs` to exhaust memory);
# raises LimitReachedError in fixed versions.
RunLengthDecode.decode(encoded_payload)

In a real-world attack, multiple such streams can be chained or nested to consume memory rapidly. Since pypdf is often used in web backends to process user-uploaded documents (e.g., for resizing, metadata extraction, or OCR preprocessing), a single malicious upload can crash the worker process handling the request.

Impact Assessment

The primary impact of CVE-2026-28351 is Denial of Service (DoS). The vulnerability allows for the exhaustion of system memory resources (RAM).

Operational Impact:

  • Service Unavailability: If parsing occurs in the main application process, the entire service can hang or crash. In multi-process deployments (such as Celery or Gunicorn workers), the worker process handling the file is typically terminated by the OS OOM killer.
  • Resource Starvation: Even if the process does not crash immediately, the excessive swap usage can degrade performance for other co-located services.
  • Automated Pipeline Disruption: Systems that automatically index or process PDFs (e.g., legal discovery platforms, receipt scanners) are highly susceptible, as they process untrusted external input by design.

Severity Metrics:

  • CVSS v4.0: 6.9 (Medium) - CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N
  • CIA Triad: The vulnerability impacts Availability exclusively. Confidentiality and Integrity are not directly compromised, as the attacker cannot read memory or execute arbitrary code beyond causing the crash.

Remediation

The vulnerability is patched in pypdf version 6.7.4. The fix adds an output size limit (RUN_LENGTH_MAX_OUTPUT_LENGTH, 75,000,000 bytes) to the RunLengthDecode filter.

Remediation Steps:

  1. Identify Usage: Check your dependency trees (e.g., pip freeze or poetry show --tree) for pypdf versions below 6.7.4.
  2. Upgrade: Update the package via your package manager:
    pip install --upgrade pypdf
  3. Verify: Ensure the installed version is 6.7.4 or higher.
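The verification step can be scripted, for example in a CI job or a start-up check. The sketch below uses only the standard library; `parse_release`, `is_patched`, and `check_installed` are hypothetical helper names, and the simplified version parsing is an assumption (it ignores epochs and treats a pre-release such as 6.7.4rc1 as 6.7.4).

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_RELEASE = (6, 7, 4)  # first pypdf release containing the fix


def parse_release(version_string: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '6.10.0' -> (6, 10, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version_string: str) -> bool:
    """True if the version is 6.7.4 or later (numeric comparison, not string comparison)."""
    return parse_release(version_string) >= FIXED_RELEASE


def check_installed(package: str = "pypdf") -> str:
    """Report whether the installed package version includes the fix."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed in this environment"
    status = "patched" if is_patched(installed) else "VULNERABLE, upgrade to 6.7.4 or later"
    return f"{package} {installed}: {status}"


print(check_installed())
```

Comparing integer tuples avoids the classic string-comparison mistake where "6.10.0" sorts before "6.7.4".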

Mitigation / Workarounds: If an immediate upgrade is not feasible, you can monkey-patch the RunLengthDecode.decode method in your application initialization code to include the length check, as shown in the patch analysis. Alternatively, implement strict file size limits on uploaded PDFs, although this is an imperfect defense as the malicious PDF can be small (high compression ratio).
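One way to sketch the monkey-patch workaround is to pre-scan the stream's control bytes, total the size the output would have, and reject the stream before the original decoder runs. This is a hedged illustration, not the upstream patch: `rle_decoded_size` and `install_guard` are hypothetical names, the guard raises a plain ValueError, and it assumes pypdf looks up `RunLengthDecode.decode` on the class at call time.

```python
import functools


def rle_decoded_size(data: bytes, stop_after: int) -> int:
    """Total the decoded size by walking control bytes, without allocating the output.

    Scanning stops early once the total exceeds `stop_after`.
    """
    total = 0
    index = 0
    while index < len(data):
        control = data[index]
        if control == 128:  # End-of-Data marker
            break
        if control < 128:
            total += control + 1   # literal run of (control + 1) bytes
            index += control + 2   # skip the control byte and the literals
        else:
            total += 257 - control  # repeat run
            index += 2              # skip the control byte and the repeated byte
        if total > stop_after:
            break
    return total


def install_guard(filter_cls, max_output: int = 75_000_000) -> None:
    """Wrap `filter_cls.decode` so oversized streams are rejected before decoding."""
    original = filter_cls.decode

    @functools.wraps(original)
    def guarded(data, *args, **kwargs):
        if rle_decoded_size(data, max_output) > max_output:
            raise ValueError("RunLengthDecode output would exceed the configured limit")
        return original(data, *args, **kwargs)

    filter_cls.decode = staticmethod(guarded)


# Assumed usage at application start-up (requires pypdf to be installed):
# from pypdf.filters import RunLengthDecode
# install_guard(RunLengthDecode)
```

The pre-scan is conservative: a truncated literal run is counted at its declared length. Treat the guard as a stopgap until the upgrade is possible, and test it against the pypdf version you actually run.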

Official Patches

py-pdf: Pull Request #3664 fixing RunLengthDecode unbounded growth


Technical Appendix

CVSS Score
6.9 / 10
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N

Affected Systems

pypdf < 6.7.4

Affected Versions Detail

Product          Affected Versions   Fixed Version
pypdf (py-pdf)   < 6.7.4             6.7.4

Attribute        Detail
CWE ID           CWE-400
CVSS v4.0        6.9
Attack Vector    Network
Impact           Denial of Service (DoS)
Exploit Status   PoC Available
Fix Version      6.7.4

MITRE ATT&CK Mapping

T1499.003 Endpoint Denial of Service: Application Exhaustion Flood (Tactic: Impact)

CWE-400: Uncontrolled Resource Consumption

The software does not properly control the allocation and maintenance of a limited resource, thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources.

Known Exploits & Detection

GitHub Security Advisory: Advisory containing PoC logic for RLE expansion.

Vulnerability Timeline

  • 2026-02-25: Robustness fix for /Annots handling merged
  • 2026-02-27: Security fix for RunLengthDecode limit merged
  • 2026-02-27: pypdf v6.7.4 released
  • 2026-02-27: Public disclosure of CVE-2026-28351

References & Sources

  • [1] GHSA-f2v5-7jq9-h8cg
  • [2] PyPI pypdf 6.7.4 Release