CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.


GHSA-P4F6-H8JJ-VFVF
CVSS 5.3|EPSS 0.06%

Infinite Darkness: ReDoS in Black Formatter

Alon Barad
Software Engineer • January 2, 2026 • 6 min read
PoC Available

Executive Summary (TL;DR)

Black versions prior to 24.3.0 are vulnerable to Regular Expression Denial of Service (ReDoS) when processing docstrings that contain long runs of tabs. An attacker can craft a file with thousands of tabs that keeps the formatter busy for an impractically long time, pinning a CPU core at 100%. The fix replaces the regex with simple linear-time string manipulation.

The world's most uncompromising Python code formatter compromised its own availability with a greedy regex. A deep dive into how overlapping character classes caused catastrophic backtracking in Black.

The Hook: Any Color You Like, As Long As It Hangs

Black describes itself as 'The uncompromising Python code formatter.' It is the gold standard in the Python ecosystem, trusted by heavyweights like Dropbox, Mozilla, and Instagram to end bike-shedding debates over code style. You run black ., and your code transforms into a uniform, PEP-8 compliant masterpiece. It is supposed to save time. It is supposed to be safe.

But here is the irony: the tool designed to rigidly enforce structure was brought to its knees by an ambiguous definition of whitespace. In CVE-2024-21503, we find a classic Regular Expression Denial of Service (ReDoS) vulnerability buried in the logic that handles—of all things—docstring indentation.

This isn't a memory corruption bug or a remote code execution via pickle deserialization. It is a logic flaw where the regex engine gets lost in a maze of the pattern's own making. By feeding Black a Python file whose docstring contains a long run of tabs, an attacker can force the formatter into a practically endless backtracking search, effectively freezing any CI/CD pipeline, pre-commit hook, or web service relying on it. It turns out, being uncompromising on style requires a very compromising regex.

The Flaw: A Tale of Three Quantifiers

To understand this bug, you have to understand why Regular Expressions (regex) are often the wrong tool for the job. The vulnerability resides in src/black/strings.py, inside a function meant to expand tabs in docstrings while preserving relative indentation. The developers needed to find the first non-whitespace character in a line that might start with a mix of tabs and spaces.

They chose this regex:

FIRST_NON_WHITESPACE_RE = re.compile(r"\s*\t+\s*(\S)")

At first glance, it looks harmless. It looks for whitespace (\s*), followed by at least one tab (\t+), followed by more whitespace (\s*), and finally captures a non-whitespace character ((\S)).

Here is the trap: In Python's regex engine (and many others), the character class \s (whitespace) includes \t (tab). This creates an ambiguity. If the engine sees a sequence like \t\t\t, it doesn't know if that second tab belongs to the first \s*, the middle \t+, or the final \s*.
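A quick way to see the overlap is to ask Python directly and then enumerate the possibilities by hand. The sketch below is illustrative only: `splits` is a hypothetical helper, not part of Black, and it lists how many tabs each of the three quantifiers could claim when a line is nothing but tabs.

```python
import re

# \s is a superset of \t: a lone tab satisfies the whitespace class.
assert re.fullmatch(r"\s", "\t") is not None


def splits(n):
    # Every way to hand n tabs to the three quantifiers:
    #   i tabs to the leading \s*, j >= 1 tabs to \t+, the rest to the trailing \s*.
    return [(i, j, n - i - j) for i in range(n) for j in range(1, n - i + 1)]


print(splits(3))
# [(0, 1, 2), (0, 2, 1), (0, 3, 0), (1, 1, 1), (1, 2, 0), (2, 1, 0)]
```

Three tabs already admit six divisions, and n tabs admit n(n+1)/2 of them. Every one of those is a path the engine may have to rule out.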

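When the regex engine encounters a line of thousands of tabs with no non-whitespace character after them (which the (\S) group at the end requires), the match cannot succeed, but the engine only learns that after backtracking through every way of dividing the tabs among the three quantifiers. It gives all the tabs to the first \s*, fails, hands one back to \t+, fails, lets the trailing \s* take fewer, fails again, and so on. Because the three quantifiers sit side by side rather than nested, the work grows polynomially (roughly with the cube of the number of tabs) rather than exponentially, but at thousands of tabs that is still billions of attempts. This is known as Catastrophic Backtracking.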
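To put a number on that growth, here is a deliberately simplified model of a naive backtracking matcher working on a line of n tabs. The name `count_attempts` and the loop structure are assumptions made for illustration; Python's real engine differs in its constants and bookkeeping, so treat the result as an estimate of the shape of the search, not a measurement.

```python
def count_attempts(n: int) -> int:
    """Count how often a naive backtracker reaches (\\S) on a line of n tabs."""
    attempts = 0
    for i in range(n, -1, -1):                  # leading \s* gives tabs back one by one
        for j in range(n - i, 0, -1):           # \t+ must keep at least one tab
            for k in range(n - i - j, -1, -1):  # trailing \s* shrinks as well
                attempts += 1                   # (\S) is tried here and always fails
    return attempts


for n in (10, 20, 40):
    print(n, count_attempts(n))
```

The count follows the closed form n(n+1)(n+2)/6, so doubling the number of tabs multiplies the work by roughly eight. Under this model, 5,000 tabs come to about 2.1 × 10^10 attempts.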

The Code: The Smoking Gun

Let's look at the vulnerable code in src/black/strings.py. The function lines_with_leading_tabs_expanded iterates over lines of a docstring and applies the regex.

# Vulnerable implementation in Black < 24.3.0
def lines_with_leading_tabs_expanded(s: str) -> List[str]:
    lines = []
    for line in s.splitlines():
        # This match attempt is where the CPU dies
        match = FIRST_NON_WHITESPACE_RE.match(line)
        if match:
            first_non_whitespace_idx = match.start(1)
            lines.append(
                line[:first_non_whitespace_idx].expandtabs()
                + line[first_non_whitespace_idx:]
            )
        else:
            lines.append(line)
    return lines

The issue is strictly in FIRST_NON_WHITESPACE_RE.match(line). Because match() anchors the pattern at the start of the line and the pattern demands a specific structure (tabs sandwiched by whitespace, ending in a visible character), an input that almost matches but fails at the very end causes the engine to explore every possible split of the "sandwich" before giving up.

If you have 5,000 tabs, that is on the order of tens of billions of attempts. The computer isn't frozen; it's just working very, very hard on a problem that has no solution.
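The difference between a match and a near miss is easy to observe safely. The snippet below uses the same pattern; the near-miss input is kept tiny on purpose so that it returns immediately. The variable names `hit` and `miss` are illustrative.

```python
import re

FIRST_NON_WHITESPACE_RE = re.compile(r"\s*\t+\s*(\S)")

# Happy path: a visible character ends the run, so the first backtrack succeeds.
hit = FIRST_NON_WHITESPACE_RE.match("\t" * 5000 + "x")
print(hit.start(1))  # 5000

# Near miss: same shape, but nothing for (\S) to capture, so every split is tried.
miss = FIRST_NON_WHITESPACE_RE.match("\t" * 30)
print(miss)  # None
```

With a trailing character the engine gives back a single tab and is done. Without one, it has to exhaust the whole search space before it can report failure.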

The Exploit: The Billion Tab Attack

Exploiting this is trivially easy. You don't need shellcode, you don't need memory addresses, and you don't need network access. You just need to convince a developer or a server to format a file.

The Scenario

Imagine a large open-source project that enforces Black formatting via GitHub Actions. An attacker submits a Pull Request adding a "documentation update." The file contains a docstring with a malicious payload.

The Payload

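# make_poc.py: writes poc.py, whose docstring contains a line of 10,000 literal tabs
source = (
    '"""\n'
    "Here is a docstring that takes a very long time to format.\n"
    + "\t" * 10000
    + '\n"""\n'
)
with open("poc.py", "w") as f:
    f.write(source)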

When the CI runner executes black ., it hits the malicious file. It reaches the line with 10,000 tabs. It attempts to match FIRST_NON_WHITESPACE_RE. The regex engine enters a state of deep meditation. The CI job stalls until the match finally fails or the platform's timeout kills it (GitHub Actions lets a job run for up to 6 hours by default). If the attacker does this across multiple PRs or repositories, they can burn through the victim's compute credits or DoS their build infrastructure.

Verification

Here is a script to verify the hang locally (do not run this on production systems):

import re
import time
 
# The vulnerable regex
REGEX = re.compile(r"\s*\t+\s*(\S)")
 
# The malicious input: lots of tabs, NO non-whitespace char at the end
payload = "\t" * 5000
 
print("Attempting match... (Press Ctrl+C to abort)")
start = time.time()
try:
    REGEX.match(payload)
except KeyboardInterrupt:
    print("\nAborted!")
 
print(f"Finished in {time.time() - start}s")
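If you would rather not rely on Ctrl+C, one option is to run the match in a child interpreter and let a timeout kill it. This is a minimal sketch using only the standard library; `finishes_within` is a hypothetical helper name and the timeout values are arbitrary.

```python
import subprocess
import sys


def finishes_within(n_tabs: int, seconds: float) -> bool:
    """Run the vulnerable match on n_tabs tabs in a child process with a time limit."""
    child_code = (
        "import re\n"
        f"re.compile(r'\\s*\\t+\\s*(\\S)').match('\\t' * {n_tabs})\n"
    )
    try:
        # subprocess.run kills the child if the timeout expires.
        subprocess.run([sys.executable, "-c", child_code], timeout=seconds, check=True)
    except subprocess.TimeoutExpired:
        return False
    return True


if __name__ == "__main__":
    print("20 tabs finish in time:  ", finishes_within(20, 30.0))
    print("5000 tabs finish in time:", finishes_within(5000, 1.0))
```

The parent process stays responsive either way, which makes this pattern convenient for regression tests that need to assert a regex does not blow up.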

The Fix: Abandoning Regex

The remediation for this vulnerability is a perfect example of "Keep It Simple, Stupid." The Black maintainers realized that using a regex to find the first non-whitespace character was overkill. Python strings have built-in methods for this that are implemented in C and run in linear time.

In version 24.3.0, they completely removed the regex logic. Here is the diff from commit f00093672628d212b8965a8993cee8bedf5fe9b8:

-        match = FIRST_NON_WHITESPACE_RE.match(line)
-        if match:
-            first_non_whitespace_idx = match.start(1)
-            lines.append(
-                line[:first_non_whitespace_idx].expandtabs()
-                + line[first_non_whitespace_idx:]
-            )
+        stripped_line = line.lstrip()
+        if not stripped_line or stripped_line == line:
+            lines.append(line)
+        else:
+            prefix_length = len(line) - len(stripped_line)
+            prefix = line[:prefix_length].expandtabs()
+            lines.append(prefix + stripped_line)

Instead of a backtracking nightmare, they now just lstrip() the line. This removes all leading whitespace. By subtracting the length of the stripped line from the original line, they get the index of the first real character. Simple math. O(N) complexity. No backtracking.
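For reference, here is the patched loop reassembled into a standalone function. It is pieced together from the diff above rather than copied from Black's source, so the surrounding details (the function name `expand_leading_tabs`, the signature, the return statement) are assumptions.

```python
from typing import List


def expand_leading_tabs(s: str) -> List[str]:
    """Expand tabs in each line's leading whitespace only, in linear time."""
    lines = []
    for line in s.splitlines():
        stripped_line = line.lstrip()
        if not stripped_line or stripped_line == line:
            # Whitespace-only line, or no leading whitespace: nothing to expand.
            lines.append(line)
        else:
            prefix_length = len(line) - len(stripped_line)
            prefix = line[:prefix_length].expandtabs()
            lines.append(prefix + stripped_line)
    return lines


print(expand_leading_tabs("\tfoo\n  \tbar\nbaz\tqux"))
# ['        foo', '        bar', 'baz\tqux']

# The former worst case is now just one pass over the line.
print(len(expand_leading_tabs("\t" * 100_000)[0]))  # 100000
```

Tabs inside the text (as in baz\tqux) are left alone, and a line consisting only of tabs is returned unchanged instead of triggering a search.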

This fix serves as a reminder to developers: just because you can solve it with a regex doesn't mean you should.

Official Patches

GitHub: Black 24.3.0 Release Notes


Technical Appendix

CVSS Score: 5.3 / 10
Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
EPSS Probability: 0.06% (Top 81% most exploited)
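The 5.3 score can be reproduced from the vector with the CVSS v3.1 base-score equations. The sketch below handles only Scope: Unchanged vectors and is not a general-purpose calculator; the names `WEIGHTS`, `roundup`, and `base_score` are illustrative.

```python
import math

VECTOR = "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L"

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score(VECTOR))  # 5.3
```

Only availability is affected, and only at the Low level, which is why an unauthenticated, low-complexity attack still lands in the Medium band.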

Affected Systems

  • Python CI/CD pipelines
  • GitHub Actions using Black
  • pre-commit hooks
  • Web-based Python formatters
  • IDEs bundling older versions of Black

Affected Versions Detail

Product | Affected Versions | Fixed Version
black (Python Software Foundation) | < 24.3.0 | 24.3.0

Attribute | Detail
CWE | CWE-1333 (Inefficient Regular Expression Complexity)
CVSS v3.1 | 5.3 (Medium)
Attack Vector | Network / Local (via file)
Complexity | Low
EPSS Score | 0.06%
Impact | Denial of Service (Availability)
Exploit Maturity | Proof of Concept Available

MITRE ATT&CK Mapping

T1499: Endpoint Denial of Service (Tactic: Impact)
T1499.003: Application Exhaustion Flood (Tactic: Impact)

CWE-1333: Inefficient Regular Expression Complexity

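The software uses a regular expression with inefficient worst-case complexity (polynomial in this case, exponential in others) that an attacker can trigger with crafted input, consuming excessive CPU and leading to Denial of Service.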

Exploit Resources

Known Exploits & Detection

GitHub Security Advisory: Advisory containing the explanation of the ReDoS vector

Vulnerability Timeline

2024-03-15: Fix commit merged to master
2024-03-19: CVE Published
2024-03-19: Black 24.3.0 Released

References & Sources

  • [1] GitHub Advisory
  • [2] NVD Entry

Related Vulnerabilities
CVE-2024-21503
