CVEReports

CVE-2026-25048
8.7

CVE-2026-25048: Stack Exhaustion Denial of Service in xgrammar EBNF Parser

Alon Barad
Software Engineer

Mar 5, 2026 · 5 min read

PoC Available

Executive Summary (TL;DR)

The xgrammar library (< 0.1.32) is vulnerable to a remote Denial of Service via stack exhaustion. By submitting a grammar with deeply nested parentheses, an attacker can drive the C++ parsing logic into unbounded recursion, exhausting the stack and crashing the application.

xgrammar, a library used for structured generation in Large Language Model (LLM) pipelines, contains a high-severity (CVSS 8.7) denial of service vulnerability in its EBNF parser. The issue stems from uncontrolled recursion during the parsing of nested grammar structures. An attacker can supply a crafted grammar string with excessive nesting (e.g., thousands of parentheses), causing the recursive descent parser to consume all available stack memory. This results in a segmentation fault (SIGSEGV) that crashes the host process.

Vulnerability Overview

The xgrammar library is a core component for enforcing structured output formats (such as JSON schemas or specific EBNF grammars) in LLM inference pipelines. It is designed to be portable and efficient, often running within high-performance inference servers. The vulnerability, tracked as CVE-2026-25048, resides in the C++ core of the library, specifically within the EBNFParser class. This component is responsible for tokenizing and parsing grammar definitions provided by users.

The flaw is a classic stack exhaustion issue caused by uncontrolled recursion (CWE-674). When the parser encounters nested structures, such as parenthetical groups in an EBNF grammar, it processes them using a recursive descent algorithm. Prior to version 0.1.32, the implementation lacked a mechanism to limit the depth of this recursion. Consequently, input containing deeply nested elements forces the allocation of stack frames beyond the operating system's limits, leading to an immediate process crash.

This vulnerability is particularly significant in multi-tenant LLM environments. If a service allows end-users to supply custom grammar constraints (a common feature for guiding model output), a single malicious request can terminate the inference service, disrupting availability for all users.

Root Cause Analysis

The root cause lies in the recursive invocation chain within cpp/grammar_parser.cc. The EBNFParser class parses grammar rules by breaking them down into elements, terms, sequences, and choices. The parsing logic for handling parentheses created a cycle that did not enforce a termination condition based on depth.

The specific recursion cycle involves the following function calls:

  1. ParseElement: Encounters a TokenType::LParen (left parenthesis).
  2. ParseChoices: Called to handle the content within the parentheses.
  3. ParseSequence: Called by ParseChoices to handle a sequence of terms.
  4. ParseTerm: Called by ParseSequence.
  5. ParseElement: Called again by ParseTerm to handle the next nested element.
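
The cycle above can be reproduced in miniature. The following is a Python sketch, not xgrammar's code: the method names mirror the C++ functions, but the single-character tokens and the depth bookkeeping (`_enter`, `_leave`, `call_depth`) are simplifications introduced here purely for illustration. It records how deep the call chain gets for a given nesting level.

```python
class ToyParser:
    """Toy recursive-descent parser mirroring the ParseElement cycle (illustrative only)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.depth = 0       # number of currently active calls in the cycle
        self.max_depth = 0   # deepest call chain observed so far

    def _enter(self):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def _leave(self):
        self.depth -= 1

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def parse_choices(self):
        self._enter()
        self.parse_sequence()
        while self.peek() == "|":
            self.pos += 1
            self.parse_sequence()
        self._leave()

    def parse_sequence(self):
        self._enter()
        while self.peek() not in ("", ")", "|"):
            self.parse_term()
        self._leave()

    def parse_term(self):
        self._enter()
        self.parse_element()
        self._leave()

    def parse_element(self):
        self._enter()
        if self.peek() == "(":
            self.pos += 1
            self.parse_choices()   # re-enters the cycle: one more nesting level
            if self.peek() == ")":
                self.pos += 1
        else:
            self.pos += 1          # consume a single-character terminal
        self._leave()


def call_depth(nesting):
    """Deepest call chain reached while parsing `nesting` levels of parentheses."""
    parser = ToyParser("(" * nesting + "a" + ")" * nesting)
    parser.parse_choices()
    return parser.max_depth
```

In this toy, `call_depth(n)` equals `4 * (n + 1)`: four frames per nesting level, growing linearly with attacker-controlled input. CPython would eventually stop such growth with a RecursionError, whereas native C++ code has no comparable built-in check.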

Each pass through this cycle pushes new frames onto the call stack. On typical Linux systems the default stack limit is 8MB. A grammar string containing 30,000 nested parentheses requires more stack space than that. Before the fix, the parser kept no depth counter (the later nest_layer_guard_) and had no max_nest_layer_ threshold to check it against, so execution continued until the stack pointer left the mapped stack region, triggering a segmentation fault.
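
As a rough back-of-the-envelope check of those numbers, the snippet below computes how many stack bytes each nesting level may use before 30,000 levels overflow the stack. The 8 MiB stack size and the four frames per level come from the text above; the real frame sizes in a compiled xgrammar build are not known here and depend on compiler and optimization level.

```python
STACK_BYTES = 8 * 1024 * 1024   # assumed default stack limit: 8 MiB
NESTING_LEVELS = 30_000         # depth used in the published PoC
FRAMES_PER_LEVEL = 4            # ParseElement, ParseChoices, ParseSequence, ParseTerm


def break_even_bytes_per_level(stack_bytes=STACK_BYTES, levels=NESTING_LEVELS):
    """Stack bytes each nesting level may use before `levels` levels overflow."""
    return stack_bytes / levels


per_level = break_even_bytes_per_level()
per_frame = per_level / FRAMES_PER_LEVEL
print(f"{per_level:.1f} bytes per level, {per_frame:.1f} bytes per frame")
# prints "279.6 bytes per level, 69.9 bytes per frame"
```

In other words, if the four functions together use more than about 280 bytes of stack per nesting level, the PoC's depth is enough to overflow an 8 MiB stack. Threads created with smaller stacks would overflow at proportionally shallower nesting.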

Code Analysis

The remediation for this vulnerability involved introducing a depth-tracking mechanism to the EBNFParser. The maintainers added a nest_layer_guard_ integer to track current depth and a max_nest_layer_ constant (defaulting to 1000) to enforce a safety limit.

The following code comparison illustrates the vulnerability and the fix in cpp/grammar_parser.cc:

Vulnerable Code (Prior to 0.1.32): Note the absence of depth checks before the recursive call to ParseChoices.

// Inside ParseElement()
if (Peek().type == TokenType::LParen) {
    Consume(TokenType::LParen);
    // UNCHECKED RECURSION
    auto choices = ParseChoices(); 
    Consume(TokenType::RParen);
    // ... processing ...
}

Patched Code (Version 0.1.32): The fix introduces a guard that increments on entry and decrements on exit. If the limit is reached, a specific error is raised instead of crashing.

// Inside ParseElement()
if (Peek().type == TokenType::LParen) {
    // FIX: Increment depth counter
    nest_layer_guard_++;
    
    // FIX: Check against maximum limit (1000)
    if (nest_layer_guard_ > max_nest_layer_) {
        ReportParseError("Nest layer exceeded the maximum limit", -1);
    }
    
    Consume(TokenType::LParen);
    auto choices = ParseChoices();
    Consume(TokenType::RParen);
    
    // FIX: Decrement depth counter
    nest_layer_guard_--;
    // ... processing ...
}

This change ensures that any attempt to nest beyond 1,000 layers results in a controlled parse error rather than a stack overflow and process crash.
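
The same guard pattern can be sketched in Python. This is an illustration of the technique, not a port of the patch: `ParseError`, `GuardedParser`, and `accepts` are hypothetical names, the grammar is reduced to parentheses around a single-character terminal, and the demo limit of 10 stands in for the 1,000-layer default so that the example stays well inside CPython's own recursion limit.

```python
class ParseError(Exception):
    """Hypothetical error type for this sketch."""


class GuardedParser:
    def __init__(self, text, max_nest_layer):
        self.text = text
        self.pos = 0
        self.nest_layer = 0
        self.max_nest_layer = max_nest_layer

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def parse_element(self):
        if self.peek() != "(":
            if self.peek() in ("", ")"):
                raise ParseError("Expected an element")
            self.pos += 1              # single-character terminal
            return
        self.nest_layer += 1           # entering one parenthesized group
        try:
            if self.nest_layer > self.max_nest_layer:
                raise ParseError("Nest layer exceeded the maximum limit")
            self.pos += 1              # consume "("
            self.parse_element()       # recursion is now bounded by the guard
            if self.peek() != ")":
                raise ParseError("Expected ')'")
            self.pos += 1              # consume ")"
        finally:
            self.nest_layer -= 1       # restore the counter on every exit path


def accepts(text, max_nest_layer=10):
    """Return True if `text` parses within the nesting limit, else False."""
    try:
        GuardedParser(text, max_nest_layer).parse_element()
        return True
    except ParseError:
        return False
```

With the demo limit, `accepts("(" * 10 + "a" + ")" * 10)` succeeds while eleven levels are rejected, regardless of how many further parentheses follow. The `try`/`finally` decrement is a choice made for this sketch so the counter is restored even when an error propagates.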

Exploitation Methodology

Exploiting this vulnerability is trivial and requires no authentication if the target application exposes the grammar compilation feature to unprivileged users. The attack vector is strictly data-driven: the attacker constructs a string where the nesting depth exceeds the stack capacity.

The Proof of Concept (PoC) uses the Python bindings of xgrammar to pass a malicious string to the underlying C++ parser. The payload consists of a long run of opening parentheses followed by a single element; no closing parentheses are needed, because the stack is exhausted while the parser is still descending.

import xgrammar as xgr

# 1. Payload Construction
# 30,000 layers is sufficient to crash a standard 8MB stack.
payload = '(' * 30000 + 'a'

# 2. Grammar Definition
grammar = f"root ::= {payload}"

# 3. Trigger
# `compiler` is assumed to be an existing xgr.GrammarCompiler instance;
# its construction from tokenizer info is not shown in the source.
# The compilation step invokes the vulnerable C++ ParseElement function.
# Result: Segmentation fault (core dumped)
compiler.compile_grammar(grammar)

In a real-world scenario, an attacker would submit this grammar via an API endpoint designed to accept JSON schemas or EBNF definitions for guiding LLM output. Upon processing the request, the worker process hosting the model would crash immediately. If the service lacks robust supervisor processes or restart logic, the service remains down. Even with auto-restart, continuous submission of this payload causes a persistent denial of service loop.
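
For services that cannot upgrade immediately, one conceivable stopgap is to measure nesting depth before a grammar ever reaches the parser. The sketch below is an assumption-laden illustration: `max_paren_depth` and `MAX_ALLOWED_DEPTH` are hypothetical names, and the rules for skipping double-quoted strings and `[...]` character classes are a simplification, not xgrammar's actual lexer (comments, for instance, are not handled). It scans iteratively, so the check itself cannot recurse.

```python
MAX_ALLOWED_DEPTH = 100  # hypothetical service policy, not an xgrammar setting


def max_paren_depth(grammar):
    """Deepest parenthesis nesting outside string literals and character classes."""
    depth = 0
    deepest = 0
    closer = None             # '"' or ']' while inside a literal/class, else None
    i = 0
    while i < len(grammar):
        ch = grammar[i]
        if closer is not None:
            if ch == "\\":
                i += 1        # skip the escaped character
            elif ch == closer:
                closer = None
        elif ch == '"':
            closer = '"'
        elif ch == "[":
            closer = "]"
        elif ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth = max(depth - 1, 0)
        i += 1
    return deepest


def is_acceptable(grammar):
    """Return True if the grammar's nesting stays within the service policy."""
    return max_paren_depth(grammar) <= MAX_ALLOWED_DEPTH
```

A request handler could call `is_acceptable(grammar)` and reject the request before compilation. This complements upgrading to a fixed version rather than replacing it.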

Impact Assessment

The primary impact of CVE-2026-25048 is on Availability. The vulnerability allows for a highly asymmetric attack: a payload of roughly 30 KB (the PoC's 30,000 parentheses) can crash a server process that may be managing gigabytes of GPU memory and state.

Severity Factors:

  • CVSS v4.0 Score: 8.7 (High).
  • Attack Vector: Network (AV:N). The grammar is typically supplied via API.
  • Privileges: None (PR:N). Often exposed to public users interacting with LLMs.
  • Confidentiality/Integrity: None. The crash occurs before memory can be read or written arbitrarily.

This vulnerability does not by itself allow Remote Code Execution (RCE). Stack overflows in C++ can in theory be chained with other memory corruption primitives, but here the recursion runs into the stack guard page and the process is terminated, so in this context it is purely a Denial of Service vector. For organizations running inference-as-a-service, this represents a significant stability risk.

Official Patches

mlc-ai (GitHub Commit): feat: limit nest layer of ebnf parsing


Technical Appendix

CVSS Score
8.7 / 10
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N
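
To read the vector at a glance, it can be split into its metrics. The helper below is a minimal sketch (`parse_cvss_vector` is a name chosen here, not part of any CVSS tooling) and performs no validation of metric names or values.

```python
def parse_cvss_vector(vector):
    """Split a CVSS vector string into (version, {metric: value})."""
    prefix, *metrics = vector.split("/")
    version = prefix.split(":", 1)[1]
    return version, dict(m.split(":", 1) for m in metrics)


version, metrics = parse_cvss_vector(
    "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N"
)
# Impact metrics: VC/VI/VA cover the vulnerable system, SC/SI/SA subsequent systems.
impacted = [name for name in ("VC", "VI", "VA", "SC", "SI", "SA") if metrics[name] != "N"]
print(version, impacted)  # prints: 4.0 ['VA']
```

Only VA (availability of the vulnerable system) is non-None, which matches the Denial of Service classification.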

Affected Systems

  • xgrammar < 0.1.32
  • MLC LLM pipelines using custom grammars
  • Structured generation services relying on mlc-ai/xgrammar
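
A quick way to check whether an environment falls in the affected range is to compare the installed version against 0.1.32. The sketch below uses only the standard library; `parse_version`, `is_affected`, and `installed_xgrammar_is_affected` are hypothetical helper names, and the parser assumes plain numeric release strings such as "0.1.31" (pre-release suffixes like "rc1" are not handled).

```python
from importlib import metadata

FIXED = (0, 1, 32)  # first fixed release according to the advisory


def parse_version(text):
    """Parse a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(version_text):
    """Return True if the given xgrammar version predates the fix."""
    return parse_version(version_text) < FIXED


def installed_xgrammar_is_affected():
    """Return True/False for the installed package, or None if it is absent."""
    try:
        return is_affected(metadata.version("xgrammar"))
    except metadata.PackageNotFoundError:
        return None
```

Tuple comparison is used instead of string comparison so that, for example, "0.1.9" is correctly ordered before "0.1.32".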

Affected Versions Detail

Product | Affected Versions | Fixed Version
xgrammar (mlc-ai) | < 0.1.32 | 0.1.32

Attribute | Detail
CWE ID | CWE-674
Vulnerability Type | Stack Exhaustion
CVSS v4.0 | 8.7 (High)
Attack Vector | Network
Attack Complexity | Low
Impact | Denial of Service

MITRE ATT&CK Mapping

  • T1499: Endpoint Denial of Service (Impact)
  • T1499.003: Application Exhaustion Flood (Impact)
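
CWE-674: Uncontrolled Recursion

The product does not properly control the amount of recursion that takes place, consuming excessive resources such as allocated memory or the program stack. Here the recursion does terminate in principle, but its depth is set by attacker-supplied input and the stack is exhausted first.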

Known Exploits & Detection

GitHub Security Advisory: Python PoC provided in the advisory, demonstrating a crash at 30,000 layers of recursion.

Vulnerability Timeline

  • 2026-01-29: Fix commit authored
  • 2026-03-04: Version 0.1.32 released
  • 2026-03-05: Vulnerability publicly disclosed

References & Sources

  • [1] GHSA-7rgv-gqhr-fxg3
  • [2] NVD - CVE-2026-25048
