Jun 16, 2026·8 min read·2 visits
Insecure path resolution, missing symlink checks, and a path-prefix boundary bypass in LangChain allow attackers to escape file sandboxes via directory traversal or symbolic links.
A path traversal and sandbox escape vulnerability in LangChain and LangChain-Anthropic Python packages allows unauthenticated local attackers to access files outside the restricted directory via crafted input, symbolic links, or prefix bypasses.
GHSA-GR75-JV2W-4656 is a path traversal and sandbox escape vulnerability identified within the LangChain library ecosystem, specifically affecting the langchain and langchain-anthropic Python packages. The LangChain ecosystem provides developers with frameworks to build applications powered by Large Language Models (LLMs), including autonomous agents that interact with the physical filesystem. To support file-related operations, LangChain implements middleware, directory search loaders, and config readers that parse file paths dynamically.
The integration of file-handling capabilities within LLM-controlled environments introduces a substantial attack surface. When LLMs are permitted to call filesystem-backed tools with arguments derived from untrusted user instructions, the application relies entirely on the underlying software boundaries to enforce directory restrictions. If the software boundaries are flawed, the LLM agent can be coerced into accessing files beyond its operational sandbox.
The vulnerability occurs because LangChain's internal path resolution mechanisms do not strictly restrict resolved paths to their specified root directories. This design deficiency manifests as an improper limitation of pathnames to a restricted directory (CWE-22) and improper link resolution before file access (CWE-59). An attacker can exploit this weakness to traverse the filesystem or follow symbolic links pointing to sensitive administrative files.
By leveraging this flaw, an attacker who can input text into the LLM prompt can trigger arbitrary file reads. The attack does not require direct access to the command line of the server hosting the application, as the LLM agent acts as an execution proxy. The vulnerability is highly operationalizable in any configuration that links LLMs with local workspace search tools.
The technical root cause of GHSA-GR75-JV2W-4656 comprises three major logical and implementation failures. First, the file-search agent middleware validates the existence of the root directory but fails to validate or sanitize search patterns (such as glob patterns and relative traversals). If an input pattern contains relative path modifiers like ../../, the middleware evaluates them relative to the root but permits the resolution of files outside that boundary.
Second, the middleware does not perform post-resolution path validation using canonicalized absolute paths. When resolving file paths, the system retrieves and reads files without verifying if the fully resolved target path is a subpath of the allowed root. If the allowed root directory contains symbolic links pointing to sensitive system files, the system dereferences and reads the target file instead of throwing an access violation. This represents a classic symbolic link vulnerability (CWE-59).
Third, the application utilizes an insecure string-prefix comparison to enforce directory boundaries. Specifically, the system validates paths by verifying whether candidate_path.startswith(allowed_root) evaluates to true. This validation strategy is insecure when the allowed_root string does not end with a directory separator. For example, if the root path is /usr/app, a candidate path of /usr/app-secrets/config.json satisfies the prefix condition despite pointing to a completely different directory. This allows attackers to access sibling directories sharing the same prefix string.
These three failures combine to create multiple escape vectors. An attacker can use directory traversal to read files relative to the workspace, use symbolic links to bypass detection when standard input validation is active, or use prefix matching flaws to access sibling application directories that might hold separate database credentials or application tokens.
To understand the vulnerability, consider the following implementation of the vulnerable path verification mechanism:
# Vulnerable Path Check Implementation
import os
def secure_file_load(user_path, safe_directory="/usr/app"):
# VULNERABILITY 1: Insecure prefix matching
# If safe_directory is '/usr/app', '/usr/app-secrets' matches
if not user_path.startswith(safe_directory):
raise ValueError("Access Denied")
# VULNERABILITY 2: Missing path canonicalization
# User path can be '/usr/app/../../etc/passwd'
# The startswith check succeeds, but the system accesses /etc/passwd
with open(user_path, 'r') as f:
return f.read()The patched version introduces strict path canonicalization using os.path.realpath or Path.resolve() to resolve all relative segments and symbolic links. It also enforces correct boundary checking by appending the directory separator or using path parent comparisons:
# Patched Path Check Implementation
import os
from pathlib import Path
def secure_file_load_patched(user_path, safe_directory="/usr/app"):
# Canonicalize safe directory and candidate path
safe_path = Path(safe_directory).resolve()
candidate_path = Path(user_path).resolve()
# Verify directory boundary using relative_to or checking parent structures
try:
# relative_to raises ValueError if candidate_path is not under safe_path
candidate_path.relative_to(safe_path)
except ValueError:
raise ValueError("Access Denied: Path is outside of safe directory")
with open(candidate_path, 'r') as f:
return f.read()The patch successfully remediates all three root causes. By resolving the realpath before executing the comparison, the system prevents both directory traversal sequences and symbolic link resolution attacks. Furthermore, using pathlib.Path.relative_to or ensuring a proper path separator prevents sibling directory prefix bypasses.
Architects must ensure that similar resolution bugs are not present in custom tools added to LangChain. Many developer-defined tools use raw os.path.join or custom regex filters that fail under complex Windows-specific or Unix-specific canonicalization edge cases. The use of standard library components like pathlib is strongly recommended for security-critical path parsing.
Exploitation of GHSA-GR75-JV2W-4656 is highly contextual and depends on the application's configuration. In a typical scenario, an LLM-powered agent is integrated with a custom tool that leverages LangChain's vulnerable file-search middleware. The agent is configured with a restricted sandbox directory, such as /home/user/workspace/.
An attacker sends a malicious prompt to the LLM agent designed to trigger the filesystem search tool. The prompt contains instruction structures or relative path strings intended to bypass application logic, such as: "Search for files matching the pattern '../../../../etc/passwd' and display their contents.". The LLM agent, interpreting this as a valid execution command, calls the underlying file-search tool with the malicious pattern.
Since the middleware does not validate the resolved path of the matched files against the allowed root directory, it executes the search and reads the contents of /etc/passwd. The output is then passed back to the LLM context and subsequently returned to the attacker. If the application environment contains symbolic links, the attacker can leverage existing links to escape the container's designated workspace without using explicit traversal sequences.
Furthermore, if the application loads configuration files dynamically from shared directories, an attacker with write access to a collaborative space can upload a modified YAML configuration file. This file can declare prompt templates that point to local files outside the permitted workspace. When the configuration loader parses the file, it resolves the unauthorized paths, leading to automatic data exposure during agent initialization.
The impact of GHSA-GR75-JV2W-4656 is rated as Moderate with a CVSS v3.1 score of 4.7. The CVSS vector is CVSS:3.1/AV:L/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N. Although the CVSS rating is Moderate due to the Local (AV:L) attack vector and High complexity (AC:H), the vulnerability poses a substantial confidentiality risk to applications deploying autonomous agents on server environments.
Successful exploitation allows unauthenticated attackers to read arbitrary files from the filesystem of the host running the LangChain application. This can result in the exposure of configuration files, environment variables, database credentials, API keys, and sensitive source code. The severity escalates if the LangChain application runs with elevated operating system privileges, enabling access to system files like /etc/shadow or sensitive cloud metadata keys.
The high complexity (AC:H) rating reflects the requirement that the target application must expose vulnerable file-search or config-loading APIs to untrusted inputs. However, in modern LLM applications where agents dynamically process arbitrary user prompts, this configuration is increasingly common, heightening the real-world likelihood of exploitation.
There is no integrity or availability impact associated with this vulnerability directly. However, the retrieval of environment variables and database keys frequently provides attackers with the initial access vectors needed to pivot to more intrusive actions, such as remote command execution or complete cloud tenant compromise.
To address the vulnerability, developers must upgrade the affected packages to safe versions. Specifically, update langchain to version 1.3.9 or later, and langchain-anthropic to version 1.4.6 or later. These versions incorporate safe canonicalization and boundary verification logic for all file access routines.
If immediate upgrading is not possible, developers should implement temporary workarounds. First, restrict the operating system user running the LangChain application to minimal filesystem permissions. Ensure that the application process cannot read sensitive system directories or files outside its immediate operational directory.
Second, disable directory-level tools in LLM agents when handling untrusted user input. If file-searching capabilities are strictly required, implement a validation wrapper around the LangChain components. This wrapper must resolve paths to their absolute real paths using Path.resolve() and verify that the target directory strictly matches the prefix of the permitted workspace directory, appending a trailing path separator before executing the check.
Finally, use containerization to enforce absolute process-level isolation. Deploying the application inside a non-privileged Docker container restricts the host filesystem exposure to only the files mounted within the container volume. Even if a path traversal occurs, the attacker remains trapped in the isolated container namespace and cannot access the underlying host OS configurations.
CVSS:3.1/AV:L/AC:H/PR:N/UI:N/S:U/C:H/I:N/A:N| Product | Affected Versions | Fixed Version |
|---|---|---|
langchain LangChain | < 1.3.9 | 1.3.9 |
langchain-anthropic LangChain | < 1.4.6 | 1.4.6 |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-22, CWE-59 |
| Attack Vector | Local |
| CVSS Score | 4.7 (Moderate) |
| EPSS Score | N/A |
| Exploit Status | None / Unproven |
| KEV Status | Not Listed |
The software uses external input to construct a pathname that is intended to identify a directory or file that is located within a restricted directory, but the software does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.
A local security vulnerability in the Nuxt development server (nuxt dev) allows local unprivileged users to access sensitive configuration files and source code. On Linux environments running Node.js 20+, Nuxt bound its internal vite-node IPC server to an abstract-namespace Unix socket without any peer authentication, enabling co-resident local users to connect and request module code directly.
Mozilla Bleach is an open-source HTML sanitizing library for Python. Versions up to and including 6.3.0 contain an incomplete filtering implementation in the URI validation logic ('sanitize_uri_value'). This logic fails to detect disallowed protocols, such as 'javascript:', if they contain Unicode invisible characters, whitespace characters, or characters with a code point greater than U+00A0. While standard-compliant web browsers do not directly execute invalid URI schemes containing these non-standard characters, downstream systems that normalize Unicode text by stripping invisible or non-ASCII characters can unintentionally reactivate the 'javascript:' prefix, causing Cross-Site Scripting (XSS). Additionally, this behavior violates Bleach's core sanitization contract by outputting URIs that bypass protocol allowlists configured by the caller.
An uncontrolled resource consumption vulnerability exists in the Python package Bleach when parsing text to linkify email addresses. When `parse_email=True` is enabled, the regular expression engine is forced into a quadratic-time complexity scan on specially crafted payloads lacking an '@' symbol. This causes immediate CPU exhaustion and blocks application server worker processes.
The PHP Secure Communications Library (phpseclib) contains a Server-Side Request Forgery (SSRF) vulnerability due to an insecure default implementation of Authority Information Access (AIA) certificate chasing. This flaw allows remote, unauthenticated attackers to coerce applications validating user-supplied X.509 certificates into generating arbitrary outbound HTTP requests to internal networks or local interfaces.
A directory traversal vulnerability exists in the Microsoft .NET System.Formats.Tar library during archive extraction. When extracting a TAR archive using the TarFile.ExtractToDirectory API, the extraction engine improperly resolves symbolic links prior to file creation, allowing local unauthorized attackers to write or overwrite arbitrary files outside the target directory. This can lead to local tampering, privilege escalation, or arbitrary code execution.
A client-side HTML sanitization bypass vulnerability exists in the Bleach library where the formaction attribute is not recognized as a URI. This allows attackers to inject javascript: URIs when formaction is on the allowed list, resulting in Cross-Site Scripting (XSS).