The root cause of this vulnerability lies in a feature introduced in justhtml version 1.9.1, designed to improve round-trip parsing fidelity. The maintainers updated the HTML serialization logic to preserve the literal text content of raw-text elements, specifically script and style. This change prevented the serializer from converting necessary syntax characters, such as >, into HTML entities like &gt;, which would otherwise break the intended script or stylesheet logic.

The implementation of this feature relied on a hardcoded set of elements (_LITERAL_TEXT_SERIALIZATION_ELEMENTS) and a function named _serialize_text_for_parent. When the serializer processes a text node, it checks the identity of the parent element. If the parent is a script or style tag, the function returns the text unmodified, bypassing the standard _escape_text routine applied to all other elements.

This exact preservation of text creates a critical desynchronization between the server-side DOM representation and the client-side browser parsing behavior. Per the HTML specification, a browser's HTML parser leaves the script raw-text context at the first </script sequence that is followed by whitespace, / or >, regardless of the document structure established by the server. The justhtml serializer failed to account for this client-side re-parsing behavior.

Because the serializer outputs the text node literally, an attacker can embed a premature closing tag followed by a new, executable HTML element. The server correctly models this as a single string within a script tag, but the browser parses it as the end of the script block followed by a new, active DOM element.
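The mismatch can be illustrated without justhtml at all. The following is a minimal sketch, assuming a simplified model of the tokenizer's raw-text state (it ignores the escaped-script states triggered by `<!--`); the helper name `split_at_script_end` is hypothetical and not part of any library.

```python
import re

# Simplified model: the script raw-text context ends at the first "</script"
# that is followed by whitespace, "/" or ">" (matched case-insensitively).
_SCRIPT_END = re.compile(r"</script(?=[\t\n\f\r />])", re.IGNORECASE)


def split_at_script_end(script_text: str) -> tuple[str, str]:
    """Return (text the browser treats as script, text it re-parses as HTML)."""
    match = _SCRIPT_END.search(script_text)
    if match is None:
        # No closing sequence: the whole text node stays inside the script.
        return script_text, ""
    close = script_text.find(">", match.end())
    if close == -1:
        return script_text[: match.start()], ""
    return script_text[: match.start()], script_text[close + 1 :]


# The server models this as ONE text node inside <script>.
text_node = "alert('injected'); </script><svg onload=alert('broken-out')>"
as_script, as_markup = split_at_script_end(text_node)
print(repr(as_script))  # "alert('injected'); "
print(repr(as_markup))  # "<svg onload=alert('broken-out')>"
```

Everything in the second return value is parsed by the browser as ordinary markup, which is why a single server-side text node can yield a live element on the client.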



GHSA-qvc2-mg72-jjhx: Mutation XSS (mXSS) in justhtml HTML Serializer

Amit Schendel

Senior Security Researcher

Mar 19, 2026


Executive Summary (TL;DR)

justhtml versions prior to 1.12.0 are vulnerable to Mutation XSS when configured with custom sanitization policies allowing <script> or <style> tags, enabling arbitrary JavaScript execution via unescaped closing tags.

The justhtml Python library prior to version 1.12.0 contains a Cross-Site Scripting (XSS) vulnerability due to flawed HTML serialization logic. The serializer preserves the literal text content of raw-text elements like script and style to maintain round-trip fidelity. If an application uses a custom sanitization policy that permits these elements, an attacker can supply closing tag sequences to break out of the context and execute arbitrary JavaScript in the victim's browser.


Vulnerability Overview

The justhtml Python package is a library for building, sanitizing, and serializing HTML documents. Prior to version 1.12.0, the library contains a Mutation Cross-Site Scripting (mXSS) vulnerability in its HTML serialization component. The flaw is tracked under GitHub Advisory GHSA-qvc2-mg72-jjhx and carries a moderate severity rating with a CVSS 4.0 score of 5.3.

The vulnerability is not exposed under the library's default configuration. The default SanitizationPolicy automatically drops the contents of script and style elements, thereby mitigating the attack vector entirely. The issue strictly manifests when developers implement a custom sanitization policy that explicitly permits these raw-text elements.

When a custom policy allows script or style tags, the serializer fails to properly neutralize closing-tag sequences embedded within text nodes. An attacker can leverage this oversight to supply input that the server-side parser treats as a single text node, but the client-side browser interprets as a context breakout. This mismatch enables the execution of arbitrary JavaScript in the victim's browser context.

Root Cause Analysis

Code Analysis

The vulnerability is isolated to the text serialization routines within serialize.py. Prior to version 1.12.0, the code utilized a static set to identify elements that required literal text preservation. The standard text escaping mechanism was deliberately bypassed for these specific tags.

_LITERAL_TEXT_SERIALIZATION_ELEMENTS = frozenset({"script", "style"})
 
def _serialize_text_for_parent(text: str | None, parent_name: str | None) -> str:
    if not text:
        return ""
    # If the parent is a script or style tag, the text is emitted literally
    if parent_name in _LITERAL_TEXT_SERIALIZATION_ELEMENTS:
        return text
    # Other elements have their text escaped (e.g., <div>, <span>)
    return _escape_text(text)

The fix implemented in version 1.12.0 introduces two primary hardening measures to address the breakout vector. First, the serialization logic was updated to actively identify and neutralize closing-tag sequences (such as </script> or </style>) whenever they appear within the text nodes of raw-text elements. This neutralization ensures that the browser parser cannot prematurely terminate the context.
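One common way to neutralize such sequences is to break up the `</` prefix so the tokenizer never sees a closing tag. The sketch below is an illustration of that general technique, not the actual justhtml 1.12.0 patch; the function name `neutralize_closing_tags` is hypothetical, and rewriting `</` to `<\/` is only value-preserving inside JavaScript string or regex literals and CSS strings.

```python
import re

_RAW_TEXT_ELEMENTS = frozenset({"script", "style"})


def neutralize_closing_tags(text: str, parent_name: str) -> str:
    """Hypothetical helper: defuse "</script" or "</style" inside raw text."""
    if parent_name not in _RAW_TEXT_ELEMENTS:
        return text
    # Match "</" only when it is immediately followed by the parent's name.
    pattern = re.compile(rf"</(?={re.escape(parent_name)})", re.IGNORECASE)
    # "<\/" is no longer a closing-tag prefix for the HTML tokenizer.
    return pattern.sub(lambda _match: "<\\/", text)


payload = "alert(1); </script><svg onload=alert(2)>"
print(neutralize_closing_tags(payload, "script"))
# alert(1); <\/script><svg onload=alert(2)>
```

With the prefix broken, the browser keeps the whole string inside the script element, matching the server's view of a single text node.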

Second, the sanitization process was strictly hardened to drop any non-text children assigned to script and style tags. By enforcing that these elements can only contain valid text nodes, the maintainers eliminated the possibility of complex, nested DOM structures being mis-serialized into executable contexts. These dual measures effectively close the mXSS vector while maintaining the required round-trip fidelity.
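The text-only rule can be sketched on a toy tree. The `Node` class and `enforce_text_only_children` function below are hypothetical stand-ins for illustration; they do not mirror justhtml's internal node types, which the advisory does not describe.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical minimal DOM node; the name "#text" marks a text node."""

    name: str
    data: str = ""
    children: list["Node"] = field(default_factory=list)


RAW_TEXT_ELEMENTS = frozenset({"script", "style"})


def enforce_text_only_children(node: Node) -> None:
    """Recursively drop non-text children of script and style elements."""
    if node.name in RAW_TEXT_ELEMENTS:
        node.children = [c for c in node.children if c.name == "#text"]
    for child in node.children:
        enforce_text_only_children(child)


# A programmatically built tree with an element nested inside <script>.
script = Node("script", children=[Node("#text", "var a = 1;"), Node("svg")])
root = Node("body", children=[script])
enforce_text_only_children(root)
print([child.name for child in script.children])  # ['#text']
```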

Exploitation

Exploitation of this vulnerability requires a specific prerequisite: the target application must instantiate justhtml using a custom SanitizationPolicy that permits script or style tags. The attacker must then identify an input field or parameter processed by this configuration.

The attack payload is designed to initiate a valid script context, immediately break out of it using a literal closing tag, and append a subsequent payload. For example, an attacker supplies the string alert('injected'); </script><svg onload=alert('broken-out')>. The application processes this input and, due to the vulnerability, emits it exactly as provided without HTML entity encoding.

from justhtml import JustHTML, SanitizationPolicy
 
# 1. Define a custom policy that allows script or style tags
policy = SanitizationPolicy(
    allowed_tags=["script", "body"],
    allowed_attributes={}
)
 
# 2. Payload designed to break out of the script tag
xss_payload = "alert('injected'); </script><svg onload=alert('broken-out')>"
html_input = f"<body><script>{xss_payload}</script></body>"
 
# 3. Sanitize and serialize
sanitized_output = JustHTML(html_input, policy=policy).to_html()
 
# 4. Observe the result (Vulnerable output shown below)
print(sanitized_output)
# <body><script>alert('injected'); </script><svg onload=alert('broken-out')></script></body>

When the resulting HTML is rendered by a user's browser, the parser encounters the </script> tag and terminates the script context. It then processes the newly exposed <svg onload=alert('broken-out')> element, executing the embedded JavaScript. The final dangling </script> tag is typically ignored or discarded by the browser's error-handling routines.
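The breakout can be observed without a browser. The sketch below feeds the vulnerable output to Python's standard-library `html.parser`, which is not a spec-compliant HTML5 parser but does treat script content as raw text up to the closing tag; the `StartTagCollector` class is a hypothetical helper for this demonstration.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record every start tag the tokenizer reports, with its attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, dict(attrs)))


# Output of the vulnerable serializer, copied from the PoC above.
serialized = (
    "<body><script>alert('injected'); </script>"
    "<svg onload=alert('broken-out')></script></body>"
)

collector = StartTagCollector()
collector.feed(serialized)
collector.close()

# Event-handler attributes that ended up on real elements.
handlers = [
    (tag, name)
    for tag, attrs in collector.start_tags
    for name in attrs
    if name.startswith("on")
]
print(handlers)  # [('svg', 'onload')]
```

A check of this kind could serve as a regression test for sanitizer output: any event-handler attribute on a parsed element means script text escaped its container.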

Impact Assessment

The vulnerability carries a CVSS 4.0 vector string of CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N. This scoring reflects a network-based attack vector with low complexity and no attack requirements. No privileges are needed, but passive user interaction (UI:P) is: the victim has to load a page that contains the serialized output. The vulnerable system itself suffers no direct confidentiality, integrity, or availability impact.
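For readers who want to inspect the vector programmatically, the following minimal sketch splits it into its metric/value pairs. The function name `parse_cvss4_vector` is hypothetical; the code does not validate metric names or compute a score.

```python
def parse_cvss4_vector(vector: str) -> dict[str, str]:
    """Split a CVSS 4.0 vector string into a {metric: value} mapping."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:4.0":
        raise ValueError(f"unexpected prefix: {prefix!r}")
    return dict(metric.split(":", 1) for metric in metrics)


vector = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N"
metrics = parse_cvss4_vector(vector)
# UI = user interaction; SC and SI = subsequent-system confidentiality/integrity.
print(metrics["UI"], metrics["SC"], metrics["SI"])  # P L L
```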

The concrete security impact is measured on the subsequent system, which is the web browser of the user viewing the serialized output. Successful exploitation grants the attacker the ability to execute arbitrary JavaScript within the context of the user's active session. This represents a classic Stored or Reflected Cross-Site Scripting (XSS) scenario, depending on how the application handles the input.

With JavaScript execution capabilities, an attacker can access sensitive session tokens, manipulate the Document Object Model (DOM), or perform actions on behalf of the authenticated user. Applications heavily reliant on custom HTML sanitization for user-generated content are at the highest risk if they explicitly allow raw-text elements.

The scope of the vulnerability is strictly confined to the client side. The server infrastructure executing the justhtml library remains entirely unaffected, as the flaw pertains exclusively to the generation of output that is misinterpreted by a downstream consumer.

Remediation

The primary remediation strategy is to upgrade the justhtml dependency to version 1.12.0 or later. This release introduces the necessary neutralization routines for closing-tag sequences and strictly enforces text-only children for raw-text elements. The update resolves the serialization flaw without requiring structural changes to the consuming application.
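A deployment check along these lines can confirm whether an environment still runs an affected release. This is a sketch under assumptions: the distribution is taken to be installed under the name `justhtml`, the helper names are hypothetical, and the simple parser only looks at the numeric release segment, so pre-release suffixes such as `rc1` are ignored.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

FIXED_RELEASE = (1, 12, 0)


def parse_release(version_string: str) -> tuple[int, ...]:
    """Parse the leading numeric components of a version, padded to three."""
    parts = []
    for piece in version_string.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_affected(version_string: str) -> bool:
    """True when the version is older than the fixed release 1.12.0."""
    return parse_release(version_string) < FIXED_RELEASE


try:
    installed = version("justhtml")
    print(f"justhtml {installed}: affected={is_affected(installed)}")
except PackageNotFoundError:
    print("justhtml is not installed in this environment")
```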

For environments where immediate patching is unfeasible, administrators must audit and modify their application code. Developers should inspect all instances of SanitizationPolicy and explicitly remove script and style from the allowed_tags parameter. Reverting to the default sanitization behavior entirely eliminates the attack surface.
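As a stopgap, the allow-list can be filtered before it reaches the policy. The helper below is a hypothetical sketch that operates on a plain list of tag names; it assumes the policy is built with an `allowed_tags` list as in the proof of concept and does not call justhtml itself.

```python
RAW_TEXT_TAGS = frozenset({"script", "style"})


def without_raw_text_tags(allowed_tags: list[str]) -> list[str]:
    """Return the allow-list with script and style removed (case-insensitive)."""
    return [tag for tag in allowed_tags if tag.lower() not in RAW_TEXT_TAGS]


# Allow-list taken from the proof of concept.
safe_tags = without_raw_text_tags(["script", "body"])
print(safe_tags)  # ['body']
```

The filtered list can then be passed as `allowed_tags` when constructing the policy, so the raw-text elements never reach the serializer.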

Developers should review the official JustHTML Sanitization Documentation for comprehensive guidance on secure, context-aware serialization. Maintaining an aggressive deny-by-default posture for raw-text elements is a standard defense-in-depth practice for HTML sanitizers.

Technical Appendix

CVSS Score

5.3 / 10

CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N

Affected Systems

justhtml HTML serialization library

Python applications utilizing custom justhtml SanitizationPolicy configurations

Affected Versions Detail

Product	Affected Versions	Fixed Version
justhtml EmilStenstrom	< 1.12.0	1.12.0

Attribute	Detail
CWE ID	CWE-79
Attack Vector	Network
CVSS Base Score	5.3
Exploit Status	PoC Available
Impact	Cross-Site Scripting (XSS)
Patch Status	Fixed in 1.12.0

MITRE ATT&CK Mapping

T1190: Exploit Public-Facing Application

Initial Access

CWE-79

Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

The software does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.

Vulnerability Timeline

2026-01-28: Version 1.3.0 released, hardening <noscript> handling against differential parsing mXSS.

2026-03-10: Version 1.9.1 released, introducing literal preservation in script/style elements, creating the mXSS vulnerability under custom policies.

2026-03-17: Version 1.12.0 released, fixing the mXSS breakout vulnerability.

2026-03-18: GitHub Advisory GHSA-qvc2-mg72-jjhx published.