Apr 14, 2026·6 min read·4 visits
A path traversal vulnerability in gdown < 5.2.2 allows attackers to overwrite arbitrary files when a victim extracts a maliciously crafted archive containing relative path components (`../`), potentially leading to remote code execution.
The Python package `gdown` prior to version 5.2.2 is vulnerable to an arbitrary file write flaw via a path traversal vulnerability in the `gdown.extractall` function. When extracting maliciously crafted ZIP or TAR archives containing relative path components (such as `../`), the extraction process writes files outside the intended destination directory. Exploiting this vulnerability requires user interaction to process the crafted archive, but successful exploitation yields arbitrary file overwrite capabilities, which an attacker can leverage for remote code execution or persistence.
The gdown utility provides a convenient interface for downloading and extracting files from Google Drive. A core component of its functionality involves processing compressed archives via the gdown.extractall function. This function abstracts the underlying Python standard library mechanisms for parsing and extracting .zip and .tar files.
The vulnerability manifests as an Improper Limitation of a Pathname to a Restricted Directory (CWE-22). When the extractall function processes an archive, it implicitly trusts the internal paths defined for each file member. If an archive contains members with directory traversal sequences (e.g., ../), the extraction mechanism applies these sequences relative to the target extraction directory.
Because the extraction function executes within the context of the user running gdown, the arbitrary file write is constrained only by the operating system permissions assigned to that user. An attacker cannot force extraction remotely without user interaction, but delivering a maliciously constructed archive to a target developer or automated pipeline satisfies the execution prerequisite.
The root cause resides in the implementation of the extractall method within gdown/extractall.py, which directly invokes the standard Python tarfile.extractall() and zipfile.extractall() functions without secondary sanitization. Python's standard library archive extractors historically extract files precisely as specified in the archive metadata, including absolute paths or relative upward traversal sequences.
While Python 3.12 introduced the filter parameter for tarfile.extractall to mitigate this exact class of vulnerability (often referred to as the "TarSlip" vulnerability), packages maintaining backward compatibility with older Python versions must implement manual boundary checks. The gdown implementation failed to validate the computed destination path of each extracted member against the base extraction directory.
The following diagram illustrates the vulnerable execution flow when processing a malicious payload:
By lacking a common path verification mechanism, gdown delegates trust to the archive contents. The operating system resolves /target/../escape.txt to the parent directory of /target, executing the path traversal.
The vulnerable implementation in gdown/extractall.py instantiates an archive opener and immediately dispatches the extraction command. The code lacks any inspection of the individual members before calling extractall().
# gdown/extractall.py (Vulnerable Implementation)
def extractall(path, to=None):
# ... (omitted) ...
with opener(path, mode) as f:
f.extractall(path=to) # Vulnerable: No path validation or filtersTo remediate this, version 5.2.2 introduces an explicit path containment check. The patch iterates over archive members and resolves their absolute paths to ensure they remain strictly within the boundaries of the designated extraction directory.
# Patched implementation logic (conceptual equivalent of v5.2.2 fix)
import os
def is_within_directory(directory, target):
abs_directory = os.path.abspath(directory)
abs_target = os.path.abspath(target)
prefix = os.path.commonpath([abs_directory])
return os.path.commonpath([abs_directory, abs_target]) == prefix
# ... Inside extraction logic ...
with opener(path, mode) as f:
if isinstance(f, tarfile.TarFile):
for member in f.getmembers():
member_path = os.path.join(to, member.name)
if not is_within_directory(to, member_path):
raise Exception("Attempted Path Traversal in Tar File")
f.extractall(path=to)The is_within_directory function uses os.path.commonpath to determine if the normalized target path shares the exact base directory prefix as the intended extraction folder. This securely neutralizes any ../ sequences by evaluating their final resolved location rather than attempting to filter the strings directly.
Exploitation relies on generating a structured archive where the name attribute of one or more file headers contains traversal sequences. Standard archive creation tools (like the tar command-line utility) often strip absolute paths and ../ sequences by default to protect users. Therefore, attackers utilize programmatic methods to construct the archives.
The following Python script demonstrates the creation of a malicious TAR archive containing a payload designed to escape the extraction directory. The script uses the tarfile module to construct a TarInfo object with the name set to ../escape.txt.
import tarfile
import io
import os
# Generate a TAR file containing a member with path traversal
with tarfile.open("evil.tar", "w") as tar:
payload = tarfile.TarInfo(name="../escape.txt")
content = b"Path Traversal Success!"
payload.size = len(content)
tar.addfile(payload, io.BytesIO(content))
print("[+] evil.tar created.")When a victim uses gdown to download and extract this file, the payload executes. For example, running python3 -c "from gdown import extractall; extractall('evil.tar', to='./safe_target/subfolder')" will cause escape.txt to be written directly into ./safe_target/, bypassing the subfolder boundary.
The primary impact of this vulnerability is arbitrary file overwrite within the context of the user executing the gdown script. The CVSS v3.1 vector evaluates to CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N with a base score of 6.5. The required user interaction prevents this from being a zero-click remote exploit, but the high integrity impact reflects the severity of the filesystem alteration.
An attacker can leverage arbitrary file overwrite to escalate privileges or establish persistence. By overwriting configuration files such as ~/.bashrc or ~/.zshrc, an attacker guarantees code execution the next time the victim opens a terminal. Similarly, overwriting ~/.ssh/authorized_keys grants immediate remote SSH access to the victim's machine.
In automated environments, such as CI/CD pipelines that use gdown to pull model weights or datasets, overwriting Python library files in the virtual environment (.venv/lib/python3.x/site-packages/...) directly leads to remote code execution during subsequent pipeline stages.
The authoritative remediation is to upgrade the gdown package to version 5.2.2 or later. This version introduces robust path containment validation before yielding files to the standard library extractors. Organizations managing Python dependencies via requirements.txt or Pipfile should explicitly pin gdown>=5.2.2.
For systems where immediate patching is not feasible, developers can mitigate the risk by operating gdown within tightly scoped, ephemeral environments with strict filesystem boundaries. Using Docker containers or specific chroot jails prevents traversal sequences from reaching critical system directories or user profiles.
Security teams can monitor for exploitation by implementing network-level inspection. Intrusion Detection Systems (IDS) can scan unencrypted HTTP downloads for ZIP or TAR file headers containing hex-encoded patterns matching ../ or absolute root paths. Furthermore, endpoint detection systems can flag execution behaviors where a Python process unexpectedly writes to SSH configuration directories or shell profile files.
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N| Product | Affected Versions | Fixed Version |
|---|---|---|
gdown wkentaro | < 5.2.2 | 5.2.2 |
| Attribute | Detail |
|---|---|
| Vulnerability Type | Path Traversal (CWE-22) |
| Attack Vector | Network |
| CVSS v3.1 Score | 6.5 (Medium) |
| Privileges Required | None |
| User Interaction | Required |
| Exploit Maturity | Proof of Concept Available |
| Impact | Arbitrary File Overwrite / Potential RCE |
The software uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the software does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.