libexpat's Pointer Amnesia: A Tale of Missing User Data (CVE-2026-24515)
Feb 4, 2026
Executive Summary (TL;DR)
libexpat versions < 2.7.4 forget to copy the data pointer registered alongside the unknown-encoding handler (the handler's `encodingHandlerData` argument) when creating a subparser for external entities. If an application uses a custom encoding handler and accesses that data, it crashes (NULL dereference).
A deep dive into a logic flaw within libexpat's external entity parsing mechanism. Specifically, the library fails to inherit user-context data when creating child parsers for unknown encodings, leading to NULL pointer dereferences in applications that rely on custom encoding handlers. While the CVSS score is low due to high complexity, the bug reveals a fundamental oversight in state management within one of the world's most ubiquitous C libraries.
The Hook: XML, The Gift That Keeps on Giving
If you've been in this industry longer than five minutes, you know that XML parsers are the gift that keeps on giving. From XXE (XML External Entity) attacks that read your /etc/passwd to Billion Laughs attacks that eat your RAM for breakfast, XML has been keeping security researchers employed for decades. Today, we're looking at libexpat, the granddaddy of stream-oriented XML parsers. It's written in C, it's fast, and it is embedded in absolutely everything—from Python's xml.parsers.expat to your browser, and likely the firmware of that smart fridge judging you for eating cheese at 3 AM.
But here's the thing about C libraries: they rely heavily on context pointers (void *userData) to maintain state. Since C isn't object-oriented in the traditional sense, if you want a callback function to know which connection or request it's handling, you have to manually pass a pointer to that state every single time. It's a game of hot potato.
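For readers who have not lived through this pattern, here is a minimal sketch of the context-pointer handoff. Everything in it (`MiniLib`, `RequestState`, `mini_set_callback`, `mini_emit`) is a hypothetical stand-in invented for illustration, not libexpat code; it only shows how a library stores a callback next to an opaque pointer and hands that pointer back on every call.

```c
#include <stddef.h>

/* Hypothetical application state that a callback needs. */
typedef struct {
  int request_id;
  int events_seen;
} RequestState;

/* The library only ever sees an opaque void pointer. */
typedef void (*EventCallback)(void *userData, const char *event);

/* Hypothetical "library" object: stores the callback and its context. */
typedef struct {
  EventCallback callback;
  void *userData;
} MiniLib;

static void mini_set_callback(MiniLib *lib, EventCallback cb, void *userData) {
  lib->callback = cb;
  lib->userData = userData; /* forget this line and the callback gets NULL */
}

static void mini_emit(const MiniLib *lib, const char *event) {
  if (lib->callback)
    lib->callback(lib->userData, event);
}

/* Callback: casts the opaque pointer back to the state it expects. */
static void count_event(void *userData, const char *event) {
  RequestState *state = (RequestState *)userData;
  (void)event;
  state->events_seen++;
}

/* Wires everything together and returns how many events the callback saw. */
static int demo_context_pointer(void) {
  RequestState state = {42, 0};
  MiniLib lib = {NULL, NULL};
  mini_set_callback(&lib, count_event, &state);
  mini_emit(&lib, "start");
  mini_emit(&lib, "end");
  return state.events_seen;
}
```

The callback has no way to verify that the pointer it receives is the one the application registered. It simply trusts whatever the library passes along.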
CVE-2026-24515 is what happens when someone drops the potato. It's a NULL Pointer Dereference (CWE-476), which sounds boring until you realize it happens during the complex dance of handling external entities combined with unknown encodings. It's a corner case of a corner case, but it exposes a sloppy logic error in how parsers spawn child parsers. The developers implemented the cloning of the handler function perfectly but completely ghosted the data meant to go with it.
The Flaw: Inheritance is Hard
Let's talk about "Parser Inheritance." In libexpat, when the main parser encounters an external entity (like <!ENTITY x SYSTEM "foo.xml">), it doesn't just read that file inline. It creates a brand new parser instance—a subparser—to handle that external context. This subparser is supposed to be a clone of the parent in terms of configuration. It inherits the handlers, the settings, and crucially, the User Data.
Imagine you are a parser. You have a custom handler for weird character encodings (let's say, EBCDIC or some cursed proprietary format). You also have a pointer to a struct containing your application state (0xCAFEBABE). When you spawn a child parser to handle an external file, you tell the child: "Hey, if you see a weird encoding, use this function." Vulnerable versions of libexpat did exactly that.
However, they forgot the second part of the sentence: "...and here is the state pointer (0xCAFEBABE) you need to do your job." Instead, the child parser initializes with the correct function pointer but a NULL data pointer. When the child parser hits an unknown encoding in the external entity, it calls the function. The function expects a valid pointer, tries to read from address 0x0, and the OS kernel steps in to smack the process with a SIGSEGV.
It's like hiring a contractor to paint a house, giving them the address, but forgetting to give them the keys. They show up (the handler is called), try to open the door (dereference the pointer), and hit a wall.
The Code: The One-Line Omission
The vulnerability lives in expat/lib/xmlparse.c, specifically in the function XML_ExternalEntityParserCreate. This function is responsible for birthing the subparser. Let's look at the diff. It is painfully simple, as most devastating C bugs are.
In the vulnerable code, the library diligently copies the m_unknownEncodingHandler function pointer. But notice what is missing immediately after.
// expat/lib/xmlparse.c - XML_ExternalEntityParserCreate
// ... setup code ...
/* The logic copies the handler function... */
parser->m_unknownEncodingHandler = oldParser->m_unknownEncodingHandler;
/* ... but where is the data? */
/* The variable m_unknownEncodingHandlerData is ignored! */
// ... rest of initialization ...
The fix, introduced in version 2.7.4, is literally one assignment. This is the difference between a stable application and a denial-of-service vector:
// THE FIX
parser->m_unknownEncodingHandler = oldParser->m_unknownEncodingHandler;
// Added in 2.7.4:
parser->m_unknownEncodingHandlerData
    = oldParser->m_unknownEncodingHandlerData;
Without this line, parser->m_unknownEncodingHandlerData defaults to NULL (via memset or initialization). The irony here is that libexpat is usually very careful about state. This specific struct member just slipped through the cracks during the cloning process, likely because UnknownEncodingHandler is a rarely customized feature compared to StartElementHandler or CharacterDataHandler.
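To see the effect of that one assignment in isolation, here is a toy model of the cloning step. It is a sketch only: `ToyParser`, `toy_clone_vulnerable`, `toy_clone_fixed`, and `probe_handler` are hypothetical names that mirror the shape of the bug, not libexpat's real structures. The probe handler reports whether it received usable state instead of dereferencing it, so the sketch never actually crashes.

```c
#include <stddef.h>

typedef int (*EncodingHandler)(void *handlerData, const char *name);

/* Toy stand-in for the two parser fields involved in the bug. */
typedef struct {
  EncodingHandler unknownEncodingHandler;
  void *unknownEncodingHandlerData;
} ToyParser;

/* Mirrors the vulnerable behavior: handler copied, data left at NULL. */
static ToyParser toy_clone_vulnerable(const ToyParser *parent) {
  ToyParser child = {NULL, NULL};
  child.unknownEncodingHandler = parent->unknownEncodingHandler;
  return child;
}

/* Mirrors the fix: the data pointer travels with the handler. */
static ToyParser toy_clone_fixed(const ToyParser *parent) {
  ToyParser child = toy_clone_vulnerable(parent);
  child.unknownEncodingHandlerData = parent->unknownEncodingHandlerData;
  return child;
}

/* Returns 1 if the handler was given state, 0 if it was handed NULL. */
static int probe_handler(void *handlerData, const char *name) {
  (void)name;
  return handlerData != NULL;
}

/* Simulates the child parser meeting an encoding it does not know. */
static int toy_hit_unknown_encoding(const ToyParser *p, const char *name) {
  return p->unknownEncodingHandler(p->unknownEncodingHandlerData, name);
}
```

Cloning a parent that was set up with `probe_handler` and a non-NULL pointer, then calling `toy_hit_unknown_encoding` on each child, yields 0 for the vulnerable clone and 1 for the fixed one. A real handler that dereferences its argument would crash in the first case.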
The Exploit: Crashing the Party
To exploit this, we don't need memory corruption magic or heap spraying. We just need to force the application down a code path where it relies on that missing pointer. The prerequisites are high (CVSS AC:H), but for an attacker targeting a specific appliance or custom server, it's viable.
The Recipe for Disaster:
- Target: Find an app that uses `libexpat` and calls `XML_SetUnknownEncodingHandler`, passing a non-NULL `userData` pointer.
- Vector: The app must parse XML with external entities enabled (default in many older configs, though often disabled now for security).
- Trigger: The external entity must declare an encoding that the parser doesn't recognize natively (e.g., `encoding='x-user-defined'`).
The PoC XML looks innocuous. The main file:
<!DOCTYPE root [
<!ENTITY payload SYSTEM "payload.ent">
]>
<root>&payload;</root>
And the referenced payload.ent file, which triggers the encoding handler:
<?xml version='1.0' encoding='x-oops'?>
<data>Boom</data>
When the parser reads payload.ent, it sees x-oops. It doesn't know what that is. It looks up the handler. It calls the handler. The handler tries to log "Encountered encoding x-oops for user [Pointer]"... and the process dies.
The Impact & Mitigation: Why Panic?
Is this the next Heartbleed? No. It's a NULL dereference, not a buffer overflow or logic bypass allowing RCE. The primary impact is Denial of Service (DoS). However, do not underestimate the annoyance of a crash. If this parser is part of a critical daemon processing XML feeds (like an RSS aggregator, a SOAP backend, or a configuration loader), a single malformed request can take down the service.
The Fix:
Upgrade to libexpat 2.7.4 or later, which contains the fix. If you are a developer using libexpat directly, there is also a defensive coding lesson here: Never trust your pointers.
Even if you know you passed a pointer in setup, your callback should look like this:
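#include <expat.h>

static int XMLCALL my_encoding_handler(void *encodingHandlerData,
                                       const XML_Char *name,
                                       XML_Encoding *info) {
  if (!encodingHandlerData) {
    // The parser betrayed us: refuse the encoding instead of crashing.
    return XML_STATUS_ERROR;
  }
  // Safe to proceed: populate info->map for `name` here, then report success.
  return XML_STATUS_OK;
}
This defensive check would neutralize the vulnerability even on the older library version: the parse fails with an unknown-encoding error instead of taking the process down. Trust no one, not even your own libraries.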
Technical Appendix
CVSS:3.1/AV:L/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:L
Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| libexpat | < 2.7.4 | 2.7.4 |
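The affected range in the table can be turned into a quick check. The helper below is a hypothetical sketch: `expat_version_is_affected` is not part of libexpat, and it only compares a version triple against 2.7.4. In a real program the three numbers could come from the `major`, `minor`, and `micro` fields returned by `XML_ExpatVersionInfo()`. A distribution that backports the patch to an older version number would be misreported, so treat the result as a hint.

```c
/* Hypothetical helper: returns 1 if major.minor.micro is older than 2.7.4,
 * the first fixed release according to the table above, and 0 otherwise. */
static int expat_version_is_affected(int major, int minor, int micro) {
  if (major != 2)
    return major < 2;
  if (minor != 7)
    return minor < 7;
  return micro < 4;
}
```

For example, `expat_version_is_affected(2, 7, 3)` returns 1, while `expat_version_is_affected(2, 7, 4)` returns 0.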
| Attribute | Detail |
|---|---|
| CWE ID | CWE-476 (NULL Pointer Dereference) |
| Attack Vector | Local / Context-dependent (Requires specific app config) |
| CVSS v3.1 | 2.9 (Low) |
| Impact | Denial of Service (Application Crash) |
| EPSS Score | 0.00013 (Low probability of wild exploitation) |
| Likelihood | Low (Requires custom handler + external entities) |
MITRE ATT&CK Mapping
A NULL pointer dereference occurs when the application dereferences a pointer that it expects to be valid, but is NULL, typically causing a crash or exit.