Jun 22, 2026
A Cross-Site Scripting (XSS) vulnerability in the Appium Model Context Protocol (MCP) server allows unescaped layout metadata to execute malicious JavaScript in the client's inspector WebView, leading to arbitrary host command execution via postMessage exploitation.
GHSA-X975-RGX4-5FH4 is a high-severity Cross-Site Scripting (XSS) vulnerability residing in the Model Context Protocol (MCP) User Interface (UI) component of appium-mcp, an NPM package integrating Appium with MCP clients. The flaw exists within the createLocatorGeneratorUI utility function, which renders UI metadata directly into an HTML template page without performing sanitization or encoding. Because MCP clients use window.parent.postMessage to send commands from the UI to the host, this XSS can be escalated to trigger arbitrary MCP tool calls, potentially leading to Remote Code Execution (RCE) on the host running the MCP client.
The appium-mcp package provides an integration layer that allows Model Context Protocol (MCP) clients to interact with Appium servers. This integration exposes automated testing functions to external clients and LLM agents. Part of this interface includes a User Interface (UI) component designed to assist users in identifying and generating element locators.
Within this UI component, the utility function createLocatorGeneratorUI compiles layout metadata retrieved from an active Appium session into an interactive HTML dashboard. This interface is loaded within an iframe or desktop application WebView wrapper. Because the metadata represents live element parameters, it is subject to external control from the application under test.
A security boundary failure occurs when the package interpolates unencoded layout attributes into the generated HTML document. The vulnerability, tracked as GHSA-X975-RGX4-5FH4, belongs to the Improper Neutralization of Input During Web Page Generation class (CWE-79). If an application under test returns malicious UI attributes, the embedded payload executes as script in the inspector UI context.
The core weakness lies in src/ui/mcp-ui-utils.ts inside the createLocatorGeneratorUI function. When generating the locator cards, the function directly injects dynamic element properties like tagName, text, contentDesc, and resourceId using standard ES6 template string interpolation. This mechanism assumes that Appium driver node data is implicitly trusted and well-formed.
Appium nodes are populated dynamically by parsing the target application layout XML or HTML source code. Attackers who control the application interface can populate these fields with arbitrary text. Because the template generator does not sanitize or escape HTML entities, any HTML tag or executable payload embedded in these attributes is rendered directly into the DOM tree of the local inspector.
A secondary, compounding flaw exists in how the test button attributes are structured. The template attempts to construct an inline JavaScript event handler using the onclick attribute. Although the code attempts to escape backticks within the selector string using a basic regular expression replace function, it neglects other metacharacters like single quotes and double quotes, facilitating attribute breakout.
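The attribute breakout can be illustrated with a small reconstruction of the button template. This is a sketch, not the package's actual source: `renderVulnerableButton` and the sample selector are hypothetical names chosen for illustration, and only the backtick-escaping behavior is taken from the description above.

```typescript
// Hypothetical reconstruction of the vulnerable button template (illustration only).
function renderVulnerableButton(strategy: string, selector: string): string {
  // Only backticks are escaped, mirroring the regex replace described above.
  const escaped = selector.replace(/`/g, '\\`');
  return `<button class="test-btn" onclick="testLocator('${strategy}', \`${escaped}\`)">Test</button>`;
}

// A double quote in the selector closes the onclick attribute early, so the
// rest of the string is parsed as a new attribute on the same element.
// The trailing // comments out the leftover backtick and parenthesis.
const hostileSelector = 'x" onmouseover="alert(1)//';
const html = renderVulnerableButton('xpath', hostileSelector);
```

In the resulting markup the `onclick` value ends at the injected double quote, and the browser treats `onmouseover` as a separate, attacker-supplied handler.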
Comparing the vulnerable implementation with the patched version reveals how the authors resolved both injection vectors. In the vulnerable code, the template directly rendered property values like element.text and element.resourceId inside HTML block tags. The patch introduces a dedicated helper named escapeHtml that translates HTML meta-characters into safe character entity references.
The escaping routine targets characters key to XML and HTML parsing, specifically mapping ampersand, less-than, greater-than, double-quote, and single-quote characters to their respective entity names. By executing escapeHtml(element.text) and wrapping other fields similarly, any potential markup payload is forced to render strictly as static text content rather than active executable structures.
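A helper of this kind might look like the sketch below. It is modeled on the description above rather than copied from the patch: the lookup-table layout and the use of the numeric reference `&#39;` for the single quote are assumptions, and the real `escapeHtml` may differ in both.

```typescript
// Hypothetical helper modeled on the escapeHtml function described in the patch.
// The exact entity choices in the real implementation are an assumption.
const HTML_ENTITIES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(value: string): string {
  // Replace each metacharacter in a single pass so that the ampersands
  // introduced by earlier replacements are never escaped twice.
  return value.replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch] ?? ch);
}
```

With this sketch, `escapeHtml('<img src="x" onerror="alert(1)">')` yields `&lt;img src=&quot;x&quot; onerror=&quot;alert(1)&quot;&gt;`, which the browser displays as literal text instead of parsing as an element.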
Additionally, the patch removes the inline onclick handler completely. Rather than attempting complex and brittle regex escaping of the strategy and selector inside string literals, the new implementation saves these variables inside HTML5 data attributes. It then implements a centralized, secure event delegation handler registered at the document scope.
Below is the technical code analysis contrasting the vulnerable string interpolation patterns with the secure entity-escaped model:
```typescript
// VULNERABLE APPROACH IN src/ui/mcp-ui-utils.ts
${element.text ? `<p class="element-text"><strong>Text:</strong> ${element.text}</p>` : ''}
<button class="test-btn" onclick="testLocator('${strategy}', \`${selector.replace(/`/g, '\\`')}\`)">Test</button>

// SECURE RESOLUTION IN v1.85.10
${element.text ? `<p class="element-text"><strong>Text:</strong> ${escapeHtml(element.text)}</p>` : ''}
<button class="test-btn" data-strategy="${escapeHtml(strategy)}" data-selector="${escapeHtml(selector)}">Test</button>
```

The centralized document event listener handles clicks on elements matching .test-btn and reads the strategy and selector from each button's dataset, so attacker-influenced strings are never evaluated as JavaScript.
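The delegated click handling could be sketched roughly as follows. This is not the patch's code: `makeDelegatedHandler`, `ButtonLike`, and `ClickTargetLike` are hypothetical names, and the structural interfaces stand in for DOM types so the example also runs outside a browser.

```typescript
// Structural stand-ins for DOM types so the sketch runs outside a browser.
interface ButtonLike {
  dataset: { strategy?: string; selector?: string };
}
interface ClickTargetLike {
  closest(selector: string): ButtonLike | null;
}

// Hypothetical delegated handler: one listener serves every Test button.
function makeDelegatedHandler(
  testLocator: (strategy: string, selector: string) => void,
): (event: { target: ClickTargetLike | null }) => void {
  return (event) => {
    // Walk up from the click target to the nearest Test button, if any.
    const button = event.target?.closest('.test-btn') ?? null;
    if (!button) return;
    const { strategy, selector } = button.dataset;
    if (strategy === undefined || selector === undefined) return;
    // The values arrive as plain strings; nothing is parsed as JavaScript.
    testLocator(strategy, selector);
  };
}
```

In a browser, a handler like this would be registered once with `document.addEventListener('click', ...)`, passing the event's target through. Because the selector travels as data, a value containing quotes or backticks has no syntactic effect.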
Exploitation of GHSA-X975-RGX4-5FH4 relies on an attacker injecting HTML formatting elements into the UI tree of an active application being automated or scanned. For example, a web page or mobile screen designed for user submission could include a text field containing an image payload. When Appium extracts this hierarchy via standard API actions, the server receives the payload.
Once the payload is stored inside the local page source or element properties, an automated testing tool or LLM client calls the Appium MCP locator generation interface. The package compiles the locator generator UI on the server side and transmits the unescaped raw string structure to the client rendering frame.
During rendering inside the WebView component, the browser DOM parser encounters the unescaped tags. An image tag such as <img src="invalid_path" onerror="[payload]"> fires its onerror handler as soon as the image fails to load. The script runs in the same context as the locator interface, which holds a postMessage channel to the parent host frame.
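The rendering step can be reproduced in isolation with a harmless payload. The sketch below assumes a simplified `ElementInfo` shape and a hypothetical `renderCardUnsafe` function that mirrors the unescaped interpolation pattern; the element values are invented for illustration.

```typescript
// Hypothetical shape for a node extracted from the page source.
interface ElementInfo {
  tagName: string;
  text?: string;
}

// Mirrors the unescaped interpolation pattern described in this article.
function renderCardUnsafe(element: ElementInfo): string {
  return element.text
    ? `<p class="element-text"><strong>Text:</strong> ${element.text}</p>`
    : '';
}

// Attacker-controlled text as it might appear in the application under test.
// The handler body is a benign marker rather than a real payload.
const hostile: ElementInfo = {
  tagName: 'android.widget.TextView',
  text: '<img src="invalid_path" onerror="console.log(\'injected\')">',
};

// The tag survives intact, so a browser would parse it as a live element.
const rendered = renderCardUnsafe(hostile);
```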
The following system workflow diagram visualizes the payload injection lifecycle, showing how an untrusted target application exploits the client inspector environment:
The technical impact of this cross-site scripting flaw extends far beyond standard DOM manipulation. Because Model Context Protocol clients are intended to link natural language systems or clients to local host execution environments, they register helper tools that can write files, execute scripts, and invoke binary packages.
If the client interface (such as Claude Desktop or specialized test automation clients) exposes the locator generator within an embedded frame, the parent document frequently processes commands sent from that frame. An injected payload can send structured postMessage messages that invoke tools registered with the client.
This escalates the impact from manipulation of the client-side browser context to command execution on the host. Because MCP client setups are often authorized to run shell tools with the local user's privileges, which may be administrative, a frame-level payload can lead directly to execution of unauthorized shell commands on the workstation.
The primary corrective measure is upgrading the appium-mcp package to version 1.85.10 or higher, which escapes interpolated element values and replaces the inline onclick handler with data attributes and a delegated listener. The update can be applied with a standard npm package upgrade.
In environments where an immediate package update is not feasible, security administrators should enforce a strict Content Security Policy (CSP) in the rendering environment of the MCP UI wrapper. A script-src policy that omits 'unsafe-inline' for the layout viewer iframe prevents the browser from evaluating inline event handler attributes such as onerror and onclick. The Test buttons in vulnerable versions rely on an inline onclick handler, so they would stop working under such a policy.
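One possible policy is sketched below. Whether a given MCP host lets integrators set response headers or inject a meta tag for the UI frame is an assumption, and `buildInspectorCsp` and the specific directive choices are hypothetical; the directives themselves are standard CSP.

```typescript
// Hypothetical policy builder for the frame that hosts the locator UI.
// The nonce admits only script tags carrying a matching nonce attribute.
// Inline event handlers such as onerror or onclick stay blocked because
// 'unsafe-inline' is absent and nonces do not apply to handler attributes.
function buildInspectorCsp(nonce: string): string {
  return [
    "default-src 'none'",
    `script-src 'nonce-${nonce}'`,
    "style-src 'self'",
    "img-src 'self' data:",
    "base-uri 'none'",
  ].join('; ');
}
```

The returned string would be sent as the value of a `Content-Security-Policy` response header, with a fresh random nonce generated for each response.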
Additionally, developers must review cross-origin communication models in their MCP host environments. Direct execution of administrative tools triggered through cross-frame postMessage requests must be audited. Implementing rigorous origin validation and requiring direct human verification for critical host tool executions mitigates sandbox escape paths.
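A host-side gate along these lines might look like the following sketch. The message schema, the `decideToolCall` function, and the idea of a low-risk tool allowlist are all assumptions made for illustration; real MCP hosts define their own message formats and approval flows.

```typescript
// Hypothetical message shape; real MCP hosts define their own schema.
interface ToolCallMessage {
  type: 'tool';
  toolName: string;
  params: Record<string, unknown>;
}

interface IncomingMessage {
  origin: string;
  data: unknown;
}

function isToolCallMessage(data: unknown): data is ToolCallMessage {
  if (typeof data !== 'object' || data === null) return false;
  const d = data as Record<string, unknown>;
  return d.type === 'tool'
    && typeof d.toolName === 'string'
    && typeof d.params === 'object'
    && d.params !== null;
}

type Decision = 'reject' | 'allow' | 'needs-confirmation';

// Decide what to do with a message received from the embedded UI frame.
function decideToolCall(
  message: IncomingMessage,
  trustedOrigin: string,
  lowRiskTools: ReadonlySet<string>,
): Decision {
  // Drop anything that does not come from the expected frame origin.
  if (message.origin !== trustedOrigin) return 'reject';
  // Drop anything that does not match the expected schema.
  if (!isToolCallMessage(message.data)) return 'reject';
  // Everything outside the allowlist requires explicit human approval.
  return lowRiskTools.has(message.data.toolName) ? 'allow' : 'needs-confirmation';
}
```

Origin validation alone does not stop this particular attack, because an XSS payload runs inside the trusted frame and inherits its origin. The confirmation step is what prevents injected script from silently invoking a high-impact tool.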
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H

| Product | Affected Versions | Fixed Version |
|---|---|---|
| appium-mcp (Appium) | < 1.85.10 | 1.85.10 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-79 |
| Attack Vector | Network |
| CVSS Score | 8.2 (High) |
| Exploit Status | Proof-of-Concept |
| KEV Status | Not Listed |
| Impact | Client-Side Script Execution / Host Command Injection via postMessage |
The software does not sanitize or escape user-controlled input before embedding it in HTML pages, allowing attackers to execute arbitrary JavaScript in the user's browser context.