GuardDog Down: The Irony of Safety Tools Choking on Zip Bombs
Jan 14, 2026
Executive Summary (TL;DR)
GuardDog versions prior to 2.7.1 contain a vulnerability in the `safe_extract()` function where highly compressed ZIP archives are unpacked without size validation. This allows an attacker to trigger a Denial of Service (DoS) via disk exhaustion. The fix involves pre-calculating uncompressed sizes and enforcing compression ratios.
DataDog's GuardDog, a tool meant to protect developers from malicious packages, was itself vulnerable to a classic denial-of-service attack: the Zip Bomb. By feeding it a specially crafted archive, an attacker could force the scanner to exhaust all available disk space, crashing CI/CD pipelines and freezing development environments.
The Hook: Who Guards the GuardDog?
In the modern software supply chain, we are terrified of "typosquatting" and malicious PyPI packages. Tools like DataDog GuardDog are the sheepdogs of this ecosystem. They prowl through package repositories, downloading wheels and eggs, ripping them open, and sniffing for malware signatures. It’s noble work. It’s necessary work.
But here is the cosmic irony: in its eagerness to inspect potentially dangerous packages, GuardDog forgot to ask a very basic question: "Is this package physically possible to open without exploding the server?"
CVE-2026-22870 isn't a complex buffer overflow or a wizard-level heap grooming exploit. It is the digital equivalent of a clown car. You invite a small, harmless-looking file into your living room (or CI runner), and suddenly 400 terabytes of clowns—or specifically, null bytes—come pouring out until the walls burst. This is a classic Zip Bomb (CWE-409), and it took down the very tool designed to keep us safe.
The Flaw: Python's Trust Issues
The vulnerability lived in guarddog/utils/archives.py. The function was ironically named safe_extract(). If you've been in this industry long enough, seeing a function named "safe_anything" usually makes you reach for a stiff drink. It implies the developer knew there was danger, tried to handle it, and likely missed the edge cases.
Python's standard zipfile library is powerful, but it is not defensive. It assumes that if you ask it to extract a file, you have the disk space to handle it. GuardDog's implementation blindly iterated through the ZIP archive and extracted files to a temporary directory so it could run its heuristics.
Here is the logic gap: The DEFLATE compression algorithm is incredibly efficient at compressing repeated data. Its ratio tops out at roughly 1000:1, so a file consisting of 10 gigabytes of zeros can be compressed down to about 10 megabytes. GuardDog didn't check the uncompressed size of the files in the Central Directory before starting the extraction. It just started writing. And writing. And writing. Until ENOSPC (Error No Space Left on Device) screamed into the void.
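To make the ratio concrete, here is a small sketch using Python's standard `zlib` module, which implements DEFLATE. The 10 MiB buffer is a scaled-down stand-in chosen for this example; it is not a figure from the advisory.

```python
import zlib

# 10 MiB of zeros stands in for the much larger file discussed above.
raw = bytes(10 * 1024 * 1024)

# zlib implements DEFLATE; level 9 is its strongest standard setting.
packed = zlib.compress(raw, 9)

ratio = len(raw) / len(packed)
print(f"{len(raw)} bytes -> {len(packed)} bytes (about {ratio:.0f}:1)")
```

The same ratio applies at any scale, which is why a download of a few megabytes can demand gigabytes of disk once inflated.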
The Code: The Smoking Gun
Let's look at the crime scene. Prior to version 2.7.1, the extraction logic was essentially a while loop of doom. It lacked the necessary brakes to stop a runaway decompression train.
The Vulnerable Logic: It essentially said: "For every file in this zip, extract it." It did not track cumulative size, nor did it check the compression ratio.
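The pattern is easy to picture. The sketch below is a hypothetical stand-in for that kind of loop, not GuardDog's actual source; the function name `naive_extract` is invented for illustration.

```python
import zipfile


def naive_extract(archive_path: str, target_dir: str) -> None:
    """Hypothetical illustration of unguarded extraction (not GuardDog's code)."""
    with zipfile.ZipFile(archive_path) as zf:
        for member in zf.infolist():
            # Nothing here looks at member.file_size, keeps a running total,
            # or compares compressed and uncompressed sizes. Each entry is
            # inflated straight onto disk, however large it turns out to be.
            zf.extract(member, path=target_dir)
```

Every value needed to refuse the archive is already sitting in `member`; the loop simply never reads it.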
The Fix (Commit c3fb07b):
The patch introduces a new sheriff in town: _check_compression_bomb. Before a single byte is written to disk, GuardDog now performs a dry run over the ZIP headers. It enforces three critical sanity checks:
- Total Size Limit: If the sum of all uncompressed file sizes exceeds `MAX_UNCOMPRESSED_SIZE`.
- File Count Limit: If the archive contains millions of tiny files (inode exhaustion).
- Compression Ratio: If the data expands by a factor greater than `MAX_COMPRESSION_RATIO` (usually around 100x).
Here is the logic that saved the day:

    # The new safety valve
    if total_size > MAX_UNCOMPRESSED_SIZE:
        raise ValueError(f"Archive uncompressed size ({total_size} bytes) exceeds maximum allowed size.")
    if archive_size > 0:
        compression_ratio = total_size / archive_size
        if compression_ratio > MAX_COMPRESSION_RATIO:
            raise ValueError(f"Archive compression ratio ({compression_ratio:.1f}:1) exceeds maximum allowed.")

This is defensive programming 101: Never trust user input, especially when that input is a compressed blob of mystery data.
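For readers who want to try the idea in isolation, here is a self-contained sketch of a header-only pre-check built on the standard `zipfile` module. It is an approximation written for this article, not the code from the patch: the function name `check_zip_headers`, the limit values, and the choice to compute `archive_size` as the sum of the members' compressed sizes are all assumptions.

```python
import io
import zipfile

# Illustrative limits only; the values GuardDog ships may differ.
MAX_UNCOMPRESSED_SIZE = 2 * 1024**3  # 2 GiB
MAX_FILE_COUNT = 100_000
MAX_COMPRESSION_RATIO = 100


def check_zip_headers(archive) -> None:
    """Raise ValueError if the ZIP headers describe a likely bomb.

    `archive` may be a path or a seekable binary file object.
    Only the central directory is read; nothing is decompressed.
    """
    with zipfile.ZipFile(archive) as zf:
        members = zf.infolist()

    if len(members) > MAX_FILE_COUNT:
        raise ValueError(f"Archive has too many entries ({len(members)}).")

    total_size = sum(m.file_size for m in members)
    # Assumption: compressed payload size is used as the denominator.
    archive_size = sum(m.compress_size for m in members)

    if total_size > MAX_UNCOMPRESSED_SIZE:
        raise ValueError(f"Archive uncompressed size ({total_size} bytes) is too large.")
    if archive_size > 0:
        ratio = total_size / archive_size
        if ratio > MAX_COMPRESSION_RATIO:
            raise ValueError(f"Archive compression ratio ({ratio:.1f}:1) is too high.")


# Usage: 1 MiB of zeros compresses far beyond 100:1, so it is rejected.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("zeros.bin", bytes(1024 * 1024))
buf.seek(0)
try:
    check_zip_headers(buf)
except ValueError as exc:
    print("rejected:", exc)
```

Because the check reads only metadata, it costs almost nothing even for an archive that would inflate to terabytes.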
The Exploit: Building the Bomb
Exploiting this is trivially easy and deeply satisfying in a chaotic neutral sort of way. You don't need shellcode. You don't need ROP gadgets. You just need a text editor and a zipper.
The Attack Recipe:
- Create a file full of zeros. In Linux, we use `dd`: `dd if=/dev/zero of=bomb.txt bs=1G count=10` (Creates a 10GB file of nothing).
- Compress it: `zip -D exploit.zip bomb.txt` (This will shrink to a few MBs).
- Host this file as a PyPI package or point GuardDog at it.
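If you maintain a scanner and want a harmless fixture for regression tests, the same recipe can be reproduced at a much smaller scale in Python. This is a sketch under stated assumptions: the helper name `build_test_archive` and the 32 MiB default are invented for this example, and the payload is streamed in 1 MiB chunks so it never sits in memory all at once.

```python
import os
import tempfile
import zipfile


def build_test_archive(path: str, size_mib: int = 32) -> None:
    """Write a ZIP holding one all-zero member of `size_mib` MiB (test fixture)."""
    chunk = bytes(1024 * 1024)  # 1 MiB of zeros
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Opening a member in "w" mode lets us stream data into the archive.
        with zf.open("bomb.txt", "w") as member:
            for _ in range(size_mib):
                member.write(chunk)


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "exploit.zip")
    build_test_archive(archive)
    with zipfile.ZipFile(archive) as zf:
        info = zf.getinfo("bomb.txt")
    print(f"on disk: {os.path.getsize(archive)} bytes, "
          f"declared uncompressed: {info.file_size} bytes")
```

Comparing the two printed numbers shows the gap a pre-extraction check has to catch.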
The Impact:
When the victim runs guarddog scan exploit.zip, the tool reads the header, sees a valid zip structure, and begins extraction to /tmp/. The Python process will consume 100% of the I/O bandwidth writing zeros to the disk.
In a containerized environment (like a GitHub Action runner or a Kubernetes pod), the ephemeral storage is often limited (e.g., 10GB-50GB). Depending on write throughput, the disk can fill up within seconds to minutes. The OS does not panic; instead, every further write fails with ENOSPC, and an orchestrator may evict the offending pod. The pipeline fails. If the runner is shared, you've just denied service to every other job on that node.
The Mitigation: Putting a Leash on It
The immediate fix is to upgrade guarddog to version 2.7.1. This version includes the patch that validates archive headers before extraction.
But the broader lesson here is for anyone parsing files in Python (or any language, really). If you are accepting archives from untrusted sources, you are accepting a potential bomb. You must implement resource limits:
- Peek before you poke: Read the file headers to estimate the output size.
- Stream, don't dump: If possible, process files in memory streams with hard limits rather than writing them to disk.
- Resource Quotas: Run your scanners in sandboxes with strict `ulimit` settings for file size and CPU time. Don't let a Python script eat your entire filesystem.
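The "stream, don't dump" advice can be sketched as a capped read: decompress a member in small chunks and abort once a byte budget is spent. The helper name `read_member_capped` and its parameters are hypothetical, and this is one possible approach rather than what GuardDog does.

```python
import io
import zipfile


def read_member_capped(zf: zipfile.ZipFile, name: str, limit: int,
                       chunk_size: int = 64 * 1024) -> bytes:
    """Return a member's contents, raising ValueError once `limit` bytes are exceeded."""
    out = bytearray()
    with zf.open(name) as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Enforce the budget on bytes actually produced by the decompressor.
            if len(out) + len(chunk) > limit:
                raise ValueError(f"{name!r} exceeds the {limit}-byte limit")
            out += chunk
    return bytes(out)


# Usage: a 1 MiB member is refused under a 64 KiB budget.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("zeros.bin", bytes(1024 * 1024))
buf.seek(0)
with zipfile.ZipFile(buf) as zf:
    try:
        read_member_capped(zf, "zeros.bin", limit=64 * 1024)
    except ValueError as exc:
        print("stopped early:", exc)
```

Memory use stays bounded by the limit plus one chunk, and the cap is applied to decompressed output rather than to sizes declared in the headers, so it works as a second line of defense behind a header check.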
Technical Appendix
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:H/SC:N/SI:N/SA:N
Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| guarddog (DataDog) | < 2.7.1 | 2.7.1 |
| Attribute | Detail |
|---|---|
| CWE | CWE-409 (Improper Handling of Highly Compressed Data) |
| CVSS v4.0 | 7.1 (High) |
| Attack Vector | Network |
| Impact | Denial of Service (Disk Exhaustion) |
| Patch Commit | c3fb07b4838945f42497e78b7a02bcfb1e63969b |
| Vulnerable Function | safe_extract() |
MITRE ATT&CK Mapping
The software does not properly handle data that is compressed or encoded in a way that allows for a high ratio of compression, enabling an attacker to cause a denial of service by consuming excessive resources.