Print to Pwned: The DOM XSS Inside html2pdf.js
Jan 14, 2026
Executive Summary (TL;DR)
A high-severity Cross-Site Scripting (XSS) vulnerability in the popular html2pdf.js library allows attackers to execute arbitrary JavaScript by injecting malicious HTML strings during PDF generation.
Versions before 0.14.0 attempted to sanitize HTML input by manually removing <script> tags *after* the input had already been assigned to innerHTML. This 'blacklist' approach failed spectacularly because inline event handlers (like onerror) are not <script> tags: they survive the cleanup and fire as soon as the browser processes the element. The fix uses DOMPurify to whitelist safe HTML before it ever touches the document.
The Hook: Client-Side PDF Generation
We all love client-side libraries. They offload work from the server, they're snappy, and usually, they're isolated enough to feel safe. html2pdf.js is the darling of this category, allowing developers to turn messy HTML invoices, reports, and tickets into crisp PDFs directly in the browser.
The premise is simple: you feed it a DOM element or a raw HTML string, and it spits out a PDF. But here lies the trap. When developers use the from() method with a string—perhaps user-generated content like a comment, a profile bio, or a company name—they expect the library to handle it safely.
They were wrong. Prior to version 0.14.0, passing a string to html2pdf.js was essentially asking the browser to render untrusted code. It’s the classic scenario: a developer trusts a library to handle the heavy lifting, only to realize the library is lifting the attacker's payload right into the execution context.
The Flaw: The Blacklist Fallacy
The vulnerability stems from a fundamental misunderstanding of how browsers parse HTML. The library developers knew that dumping raw HTML into the DOM was dangerous, so they tried to implement a safety filter.
Their strategy? Take the input string, shove it into an element's innerHTML, and then iterate through the DOM to find and remove <script> tags. In theory, this cleans up the mess. In practice, it’s like letting a burglar into your house and then asking them to leave after they've already stolen your TV.
The cleanup loop is looking for the wrong thing. The moment innerHTML is assigned, the browser parses the string and builds live elements, inline event handlers included. If it encounters an image tag with a broken source and an onerror handler, it starts loading the image, and the handler fires as soon as the load fails. Removing <script> tags does nothing to prevent that, because the <img> element and its handler are never touched. The malicious JavaScript runs regardless of the filter, and the damage is done.
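The blind spot is easy to reproduce without a browser. The sketch below is a string-level stand-in for the library's cleanup (the real code walks the DOM, and `stripScriptTags` is a hypothetical name used only here). It illustrates the gap in the blacklist and is not a sanitizer.

```javascript
// Illustration only: a string-level imitation of a "remove <script> tags" blacklist.
// The real library removed <script> elements from the DOM; this regex stand-in is
// an assumption made so the example runs without a browser.
function stripScriptTags(html) {
  return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

const payloads = [
  '<script>alert(1)</script>',
  '<img src=x onerror=alert(document.cookie)>',
  '<svg/onload=alert("You_are_pwned")>',
];

// Collect the payloads that the blacklist leaves completely untouched.
const survivors = payloads.filter((p) => stripScriptTags(p) === p);

for (const p of payloads) {
  console.log(survivors.includes(p) ? 'SURVIVES:' : 'removed: ', p);
}
```

Only the literal <script> payload is removed. Both event-handler payloads pass through unchanged, which is the gap the attack relies on.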
The Code: Anatomy of a Screw-up
Let's look at the smoking gun in src/utils.js. This is a textbook example of insecure DOM handling.
The Vulnerable Code (< 0.14.0):
export const createElement = function createElement(tagName, opt) {
  var el = document.createElement(tagName);
  if (opt.className) el.className = opt.className;
  if (opt.innerHTML) {
    // [!] FATAL ERROR: Assignment parses the untrusted string, event handlers included
    el.innerHTML = opt.innerHTML;
    // [!] TOO LITTLE: Only <script> tags are removed; onerror/onload attributes survive.
    var scripts = el.getElementsByTagName('script');
    for (var i = scripts.length; i-- > 0; null) {
      scripts[i].parentNode.removeChild(scripts[i]);
    }
  }
  return el;
};
The fix involved completely ripping out this naive logic and replacing it with a battle-tested sanitizer. The developers brought in DOMPurify to ensure the HTML is clean before it touches the DOM.
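To make the whitelist idea concrete, here is a toy model. It is not DOMPurify's implementation: nodes are plain objects rather than real DOM elements, and the names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, and `allowlistFilter` are invented for this sketch. What matters is the default: anything not explicitly permitted is dropped.

```javascript
// Toy allowlist filter over plain objects (NOT real DOM nodes, NOT DOMPurify).
// All names and the allowed sets below are assumptions for illustration.
const ALLOWED_TAGS = new Set(['div', 'p', 'span', 'img', 'table', 'tr', 'td']);
const ALLOWED_ATTRS = new Set(['class', 'src', 'alt', 'width', 'height']);

function allowlistFilter(nodes) {
  return nodes
    // Drop every element whose tag is not explicitly allowed.
    .filter((node) => ALLOWED_TAGS.has(node.tag))
    // On surviving elements, keep only explicitly allowed attributes.
    .map((node) => ({
      tag: node.tag,
      attrs: Object.fromEntries(
        Object.entries(node.attrs).filter(([name]) => ALLOWED_ATTRS.has(name))
      ),
    }));
}

const parsed = [
  { tag: 'p', attrs: { class: 'total' } },
  { tag: 'img', attrs: { src: 'x', onerror: 'alert(document.cookie)' } },
  { tag: 'svg', attrs: { onload: 'alert("You_are_pwned")' } },
  { tag: 'script', attrs: {} },
];

const cleaned = allowlistFilter(parsed);
console.log(JSON.stringify(cleaned));
```

The <img> survives but loses its onerror attribute, while the <svg> and <script> entries disappear because nobody listed them as safe. A real sanitizer has to handle far more (URL schemes in attribute values, nested content, parser quirks), which is why the fix delegates to a maintained library.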
The Fixed Code (v0.14.0):
import DOMPurify from 'dompurify';
export const createElement = function createElement(tagName, opt) {
  var el = document.createElement(tagName);
  if (opt.className) el.className = opt.className;
  if (opt.innerHTML) {
    // [✓] SECURE: Input is sanitized against a whitelist before parsing
    el.innerHTML = DOMPurify.sanitize(opt.innerHTML);
  }
  // ...
};
The Exploit: Weaponizing the Print Button
Exploiting this is trivially easy for anyone who understands XSS. We don't need complex memory corruption; we just need a standard HTML vector that doesn't rely on the <script> tag.
Imagine an invoicing application where the user can set their "Company Name". The application renders this invoice to a PDF using html2pdf.js.
The Attack Vector:
- Input: The attacker sets their company name to: <img src=x onerror=alert(document.cookie)>.
- Trigger: The victim (admin) opens the invoice and clicks "Download PDF".
- Execution: html2pdf takes the string, sets innerHTML, and the browser immediately tries to load the image x. It fails, triggering onerror, and pops the alert (or exfiltrates the admin's session token).
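The three steps above can be sketched as follows. The invoice template and the names `buildInvoiceHtml` and `escapeHtml` are hypothetical, since this write-up does not show the victim application's code. The sketch shows how a company name lands in the HTML string as markup, and how escaping a field that should only ever be plain text keeps it inert.

```javascript
// Hypothetical invoice template: the application's real code is not shown in
// this write-up, so every name here is an assumption.
function buildInvoiceHtml(companyName) {
  // Untrusted text is interpolated straight into markup.
  return `<div class="invoice"><h1>${companyName}</h1></div>`;
}

// Encode the characters that let text break out into markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const companyName = '<img src=x onerror=alert(document.cookie)>';

const unsafeHtml = buildInvoiceHtml(companyName);
const escapedHtml = buildInvoiceHtml(escapeHtml(companyName));

console.log(unsafeHtml);  // contains a real <img> tag with an onerror handler
console.log(escapedHtml); // the payload is now inert text
```

Escaping only works for fields that are meant to be plain text. When the input is supposed to contain HTML, a sanitizer is still required.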
Proof of Concept:
// This simulates the library call with malicious input
import html2pdf from 'html2pdf.js';
const maliciousPayload = '<svg/onload=alert("You_are_pwned")>';
// When this runs, the alert fires instantly
html2pdf().from(maliciousPayload).save();
Because the library only looked for <script> tags in previous versions, this SVG vector walks right past the security check.
The Fix: Mitigation & Cleanup
The only reliable fix is to upgrade html2pdf.js to version 0.14.0 or later. This version introduces DOMPurify as a hard dependency, which uses a strict whitelist to strip out dangerous tags and attributes (like onerror, onload, iframe, etc.) while preserving the visual structure needed for the PDF.
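A quick way to audit a project is to compare the installed version against the fixed release. The helper below is a minimal sketch: `isVulnerableVersion` is a made-up name, and it assumes plain `major.minor.patch` version strings, ignoring any pre-release or build suffix.

```javascript
// Minimal sketch: true if an html2pdf.js version string is below 0.14.0.
// Assumes "major.minor.patch"; anything after a "-" or "+" is ignored.
const FIXED_VERSION = [0, 14, 0];

function isVulnerableVersion(version) {
  const parts = version
    .split(/[-+]/)[0]
    .split('.')
    .map((n) => Number.parseInt(n, 10) || 0);

  for (let i = 0; i < FIXED_VERSION.length; i++) {
    const part = parts[i] || 0;
    if (part < FIXED_VERSION[i]) return true;
    if (part > FIXED_VERSION[i]) return false;
  }
  return false; // exactly 0.14.0 is the fixed release
}

console.log(isVulnerableVersion('0.9.3'));  // true
console.log(isVulnerableVersion('0.10.1')); // true
console.log(isVulnerableVersion('0.14.0')); // false
```

The installed version can be read from the lockfile or with `npm ls html2pdf.js`, which also reveals copies pulled in as transitive dependencies.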
If you absolutely cannot upgrade immediately (why?), you must manually sanitize any string input before passing it to html2pdf().from(). Do not rely on your own regex; use a library like DOMPurify yourself.
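One way to enforce this is to route every string through a single wrapper. The sketch below takes the sanitizer and the html2pdf factory as parameters so it stays self-contained. `renderUntrustedHtml` is a hypothetical helper, not part of either library, and the stand-ins at the bottom exist only to show the call order.

```javascript
// Hypothetical wrapper: sanitize first, then hand the clean string to html2pdf.
// `sanitize` and `html2pdf` are injected so the sketch runs without a browser.
function renderUntrustedHtml(dirtyHtml, { sanitize, html2pdf }) {
  if (typeof dirtyHtml !== 'string') {
    throw new TypeError('renderUntrustedHtml expects an HTML string');
  }
  const cleanHtml = sanitize(dirtyHtml);
  // Same call chain as the proof of concept, but with sanitized input.
  return html2pdf().from(cleanHtml).save();
}

// --- Stand-ins for demonstration only (not the real libraries) ---
const seen = [];
const fakeHtml2pdf = () => ({
  from(html) {
    seen.push(html); // record what would have reached the sink
    return { save: () => 'saved' };
  },
});
// Placeholder that only proves the wrapper calls the sanitizer first;
// in real code pass (dirty) => DOMPurify.sanitize(dirty).
const fakeSanitize = (html) => `[sanitized ${html.length} chars]`;

const result = renderUntrustedHtml(
  'ACME <img src=x onerror=alert(document.cookie)>',
  { sanitize: fakeSanitize, html2pdf: fakeHtml2pdf }
);
console.log(result, seen);
```

In a browser the call would look like `renderUntrustedHtml(userHtml, { sanitize: (dirty) => DOMPurify.sanitize(dirty), html2pdf })`. Keeping the sanitizer right next to the sink means no caller can skip it by accident.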
[!NOTE] This vulnerability highlights why "Sanitization" is distinct from "Validation". You can validate input all day, but if the sink (the place where data is used) is insecure, you are still vulnerable. Always sanitize at the boundary, preferably right before the sink.
Technical Appendix
CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:H/VI:H/VA:L/SC:N/SI:N/SA:N
Affected Systems
Affected Versions Detail
| Product | Affected Versions | Fixed Version |
|---|---|---|
| html2pdf.js (eKoopmans) | < 0.14.0 | 0.14.0 |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-79 (Improper Neutralization of Input During Web Page Generation) |
| Attack Vector | Network (Client-Side) |
| CVSS v4.0 | 8.7 (High) |
| Impact | High Confidentiality, High Integrity |
| Exploit Status | PoC Available / Trivial |
| Affected Component | src/utils.js:createElement |
MITRE ATT&CK Mapping
CWE-79: The software does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.