Jun 21, 2026
Type confusion bypasses filesystem safeguards in LangSmith SDK TracingMiddleware, allowing remote attackers to silently exfiltrate server files to the telemetry dashboard.
The LangSmith Python SDK TracingMiddleware is vulnerable to an arbitrary server-side file read. Due to origin validation and type confusion flaws, external inputs parsed from distributed tracing headers bypass local filesystem read protections, allowing remote attackers to silently exfiltrate arbitrary server files to the telemetry dashboard.
The langsmith package prior to version 0.8.18 contains a critical vulnerability in its telemetry client infrastructure. The flaw exists specifically within the TracingMiddleware component, which handles incoming distributed tracing context over HTTP. This component parses metadata and attributes associated with execution spans and trace runs.
The primary attack surface involves the processing of run attributes supplied via HTTP headers. Because the middleware does not validate the source or integrity of these trace-propagation inputs, it allows remote, unauthenticated attackers to inject arbitrary attributes. This input validation failure enables the insertion of unauthorized file attachment metadata into the trace run context.
This vulnerability is classified as a combination of Origin Validation Error (CWE-346) and Type Confusion (CWE-843). By exploiting these flaws, an attacker can manipulate the background trace-upload mechanism to read arbitrary files from the server's local file system. These files are subsequently exfiltrated to the destination LangSmith workspace without user interaction.
The root cause of the vulnerability lies in the mechanism used by the LangSmith Python SDK to handle attachments. Under normal operation, users can associate local file attachments with trace runs. To prevent arbitrary local file reads, the SDK implements a protection mechanism that is controlled by the dangerously_allow_filesystem configuration flag.
In vulnerable versions of the SDK, the validation logic in the create_run, multipart_ingest, and update_run functions explicitly checks whether an attachment is structured as a tuple, and whether its second element is an instance of the pathlib.Path class. The intent of this check is to intercept and block file paths when dangerously_allow_filesystem is disabled.
However, when trace headers are transmitted over HTTP, they are serialized as JSON payloads. The JSON deserializer converts JSON arrays into Python list objects and JSON strings into Python str objects. Because of this type mismatch, an attacker-supplied payload like {"attachments": {"leak": ["text/plain", "/etc/passwd"]}} bypasses the type validation.
The check isinstance(attachment, tuple) evaluates to False because the deserialized attachment is a list. Similarly, isinstance(attachment[1], Path) evaluates to False because the deserialized path is a str. Consequently, the validation block fails to raise a ValueError and execution proceeds to the serialization phase.
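The type mismatch can be reproduced in isolation. The sketch below is not SDK code: `old_guard_blocks` is a hypothetical helper that only mirrors the pre-patch condition described above, and the payload is the example from the text.

```python
import json
from pathlib import Path


def old_guard_blocks(attachment, dangerously_allow_filesystem=False):
    """Hypothetical mirror of the pre-patch condition.

    Returns True when the old check would have raised ValueError.
    """
    return (
        isinstance(attachment, tuple)
        and isinstance(attachment[1], Path)
        and not dangerously_allow_filesystem
    )


# What a local caller would pass: a tuple holding a pathlib.Path.
local_attachment = ("text/plain", Path("/etc/passwd"))

# What arrives after JSON decoding of attacker-controlled input.
payload = json.loads('{"attachments": {"leak": ["text/plain", "/etc/passwd"]}}')
remote_attachment = payload["attachments"]["leak"]

print(type(remote_attachment).__name__, type(remote_attachment[1]).__name__)  # list str
print(old_guard_blocks(local_attachment))   # True: the guard fires
print(old_guard_blocks(remote_attachment))  # False: the guard is skipped
```

Both values describe the same file, but only the tuple-and-`Path` form is recognized by the guard.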
During the serialization phase, the background tracing thread attempts to prepare the attachment for upload. The serializer opens any attachment that is not raw inline bytes, treating the value as a file system path. It performs a standard open(attachment_path, "rb") operation on the string path, reads the file, and prepares it for transmission.
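A minimal sketch of that serialization behavior follows. It assumes a simplified two-element attachment, and `load_attachment_bytes` is a hypothetical stand-in rather than the SDK's actual serializer. The demonstration reads a temporary file instead of a real system file.

```python
import os
import tempfile


def load_attachment_bytes(attachment):
    """Hypothetical stand-in for the serializer's attachment handling.

    Inline bytes are used as-is; anything else is opened as a file path.
    """
    _content_type, data = attachment
    if isinstance(data, bytes):
        return data
    with open(data, "rb") as handle:
        return handle.read()


# Create a throwaway file that plays the role of a sensitive server file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"SECRET=hunter2\n")
    secret_path = tmp.name

try:
    inline = load_attachment_bytes(("text/plain", b"hello"))
    # A JSON-shaped list with a str path is read from disk.
    leaked = load_attachment_bytes(["text/plain", secret_path])
finally:
    os.unlink(secret_path)

print(inline)
print(leaked)
```

Because unpacking works on lists as well as tuples, nothing in this step distinguishes a trusted local attachment from a deserialized one.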
An examination of the vulnerable implementation versus the patched implementation demonstrates how the type confusion was resolved. The old verification logic was distributed across multiple client methods and relied on narrow type checks.
Before the fix was implemented, the validation blocks in the SDK client checked for explicit tuple and pathlib.Path types. The validation check was written as follows:
```python
# Vulnerable Check (Before Patch)
# Only a (content_type, Path) tuple matches; list/str input passes through.
if run_create.get("attachments") is not None:
    for attachment in run_create["attachments"].values():
        if (
            isinstance(attachment, tuple)
            and isinstance(attachment[1], Path)
            and not dangerously_allow_filesystem
        ):
            raise ValueError(
                "Must set dangerously_allow_filesystem=True to allow passing in Paths for attachments."
            )
```

This logic is structurally weak because incoming JSON-deserialized data contains list and str types instead of tuple and Path types. The patched implementation shifts from class assertions to a robust evaluation of data content, moving the check to a centralized helper module:
```python
# Patched Check (After Patch)
from typing import Any


def _attachment_references_filesystem(attachment: Any) -> bool:
    """Return True if an attachment is a filesystem path rather than inline bytes.

    Serialization opens any attachment data that isn't ``bytes`` as a file path
    (see ``serialized_run_operation_to_multipart_parts_and_context``). Inline data
    is ``bytes`` or a 2-element ``(content_type, bytes)``; anything else is treated
    as a filesystem reference (fail closed). Unlike the old ``(tuple, Path)`` guard,
    this also catches JSON/dict-shaped input (``list``/``str``).
    """
    if isinstance(attachment, bytes):
        return False
    if isinstance(attachment, (tuple, list)) and len(attachment) == 2:
        return not isinstance(attachment[1], bytes)
    return True
```

The helper function _attachment_references_filesystem categorizes any attachment that is not explicit inline bytes as a file system reference. It evaluates JSON-shaped lists and strings accurately. This centralized logic is executed across all relevant operations via the _reject_filesystem_attachments helper.
Exploitation of this vulnerability requires network access to the target application's tracing endpoints and trace-propagation mechanisms. Because the input deserialization process occurs automatically when handling incoming middleware requests, no authentication is required to trigger the local file read.
The following diagram outlines the complete data exfiltration pipeline, showing how an external attacker triggers the internal file read and retrieves the data via the telemetry workspace:
An attacker crafts an HTTP request representing a tracing run containing an attachment pointing to a sensitive file path, such as /etc/passwd or .env. When the background worker serializes the trace run, it reads the target file under the privileges of the application process. The contents are subsequently transmitted to the configured LangSmith workspace, where they are accessible via the workspace dashboard.
The severity of this vulnerability is high, carrying a CVSS v3.1 base score of 7.7. The vector string is CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:N/A:N. The scope is changed because a vulnerability in the local SDK client results in data exposure on an external software-as-a-service platform.
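The 7.7 figure can be checked against the vector string. The sketch below applies the base-score equations and metric weights from the CVSS v3.1 specification; the function and constant names are chosen for this example, and temporal and environmental metrics are ignored.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether the scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a full base vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * pr
        * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:N/A:N"))  # 7.7
```

With a changed scope, the impact sub-score is about 3.99 and exploitability about 3.11; their sum scaled by 1.08 is roughly 7.67, which rounds up to 7.7.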
The primary security consequence is the unauthorized exposure of sensitive local files. If the Python process runs with elevated privileges, an attacker can retrieve configuration credentials, API keys, private keys, and critical system database files. This bypasses local boundary controls and access control lists.
Furthermore, because the file extraction is executed automatically by the background tracing thread using legitimate API credentials, the exfiltration traffic resembles legitimate telemetry updates. This significantly reduces the likelihood of detection by traditional network-based intrusion detection systems.
The primary and recommended remediation is to upgrade the langsmith Python package to version 0.8.18 or higher. This update replaces the vulnerable type check with a robust validation function that blocks filesystem access by default unless dangerously_allow_filesystem is explicitly set to True.
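A deployment can check which side of the fix it is on with a short script. The sketch below uses the standard library's `importlib.metadata`; the helper names are hypothetical, and the version parsing is deliberately simple: it compares only the leading numeric components and does not order pre-release suffixes such as `rc1`, so a full version parser is preferable where one is available.

```python
from importlib import metadata
from typing import Optional, Tuple

FIXED_VERSION = (0, 8, 18)  # first fixed release, per the advisory


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract leading numeric components, e.g. '0.8.17' -> (0, 8, 17)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True when the version sorts before the fixed release.

    An unparseable version yields an empty tuple and is treated as affected.
    """
    return parse_release(version) < FIXED_VERSION


def installed_langsmith_is_affected() -> Optional[bool]:
    """Check the installed langsmith package; None if it is not installed."""
    try:
        return is_affected(metadata.version("langsmith"))
    except metadata.PackageNotFoundError:
        return None


print(is_affected("0.8.17"))  # True
print(is_affected("0.8.18"))  # False
print(installed_langsmith_is_affected())
```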
For environments where an immediate upgrade is not feasible, organizations should implement the following mitigation strategies to reduce exposure:
- Restrict public access to endpoints utilizing the TracingMiddleware component to prevent unauthorized tracing injections.
- Conduct an audit of LangSmith workspace access permissions, removing low-privilege or inactive users to prevent unauthorized access to uploaded traces.
- Implement host-based monitoring or Web Application Firewall (WAF) rules to detect and drop tracing propagation headers containing JSON array representations of file paths.
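The detection idea in the last mitigation can be sketched as a small filter. This is an illustration under stated assumptions: the header value is assumed to be a JSON object with a top-level `attachments` key, matching the example payload earlier in this article, and `suspicious_attachment_names` is a hypothetical helper. The actual header name and encoding used by the middleware are not covered here, so a real rule would need to be adapted to the deployed format.

```python
import json
from typing import List


def suspicious_attachment_names(raw_value: str) -> List[str]:
    """Return names of attachments whose data is a string (a likely file path).

    JSON cannot carry raw bytes, so string data in an attachment slot is
    treated as a filesystem reference.
    """
    try:
        decoded = json.loads(raw_value)
    except (TypeError, ValueError):
        return []
    if not isinstance(decoded, dict):
        return []
    attachments = decoded.get("attachments")
    if not isinstance(attachments, dict):
        return []

    flagged = []
    for name, value in attachments.items():
        if isinstance(value, str):
            flagged.append(name)
        elif isinstance(value, list) and len(value) == 2 and isinstance(value[1], str):
            flagged.append(name)
    return flagged


malicious = '{"attachments": {"leak": ["text/plain", "/etc/passwd"]}}'
benign = '{"metadata": {"request_id": "abc"}}'

print(suspicious_attachment_names(malicious))  # ['leak']
print(suspicious_attachment_names(benign))     # []
```

A request handler or proxy could drop the header, or the whole request, whenever the returned list is non-empty.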
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:N/A:N

| Product | Affected Versions | Fixed Version |
|---|---|---|
| langsmith (LangChain) | < 0.8.18 | 0.8.18 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-843, CWE-346, CWE-22 |
| Attack Vector | Network |
| CVSS Score | 7.7 |
| Exploit Status | PoC Available |
| Impact | Arbitrary File Read & Exfiltration |
CWE-843 (Type Confusion): The product allocates or accesses a resource of one type, but later accesses that resource using a type that is incompatible with the original type.