May 14, 2026
Karakeep workers are vulnerable to SSRF via the metascraper-logo-favicon plugin, which autonomously probes internal network resources during HTML parsing.
A critical Server-Side Request Forgery (SSRF) vulnerability exists in the Karakeep metadata extraction process prior to version 0.32.0. The flaw allows attackers to bypass primary URL validation and target internal network resources or cloud metadata services via crafted webpage metadata.
Karakeep provides URL bookmarking capabilities that rely on worker processes to extract metadata from submitted links. These worker processes handle external resources asynchronously to improve application performance and parse rich media content.
The application utilizes the metascraper library to identify and extract data from HTML documents. Specifically, the metascraper-logo-favicon plugin processes the target page to locate icon assets suitable for display within the user interface.
A Server-Side Request Forgery (CWE-918) vulnerability exists in this extraction pipeline due to secondary HTTP requests generated by the plugin. These secondary requests bypass the primary URL validation mechanisms implemented by the core application.
Attackers exploit this flaw by supplying a malicious external URL that passes initial system checks. The worker process subsequently fetches unvalidated internal resources during the metadata extraction phase, allowing unauthorized network reconnaissance.
The vulnerability stems from the default behavior of the metascraper-logo-favicon plugin when evaluating discovered assets. Upon parsing an HTML document, the plugin actively scans for <link rel="icon"> and <link rel="apple-touch-icon"> elements and reads the URL in each element's href attribute.
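The discovery step can be modeled in a few lines. The sketch below is a simplified stand-in, not the plugin's source: `extractIconUrls` is a hypothetical helper, and it uses regular expressions where a real scraper would use a DOM parser. It shows that any href on the page, absolute or relative, becomes a candidate URL for the later probe.

```typescript
// Hypothetical model of icon discovery. The function name and the regex-based
// parsing are assumptions for illustration; a real scraper uses a DOM parser.
function extractIconUrls(html: string, pageUrl: string): string[] {
  const found: string[] = [];
  const linkTag = /<link\b[^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = linkTag.exec(html)) !== null) {
    const tag = match[0];
    const rel = /\brel\s*=\s*["']([^"']*)["']/i.exec(tag);
    const href = /\bhref\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!rel || !href) continue;
    // rel is a space-separated token list, e.g. "shortcut icon".
    const tokens = (rel[1] ?? "").toLowerCase().split(/\s+/);
    const isIcon =
      tokens.indexOf("icon") !== -1 || tokens.indexOf("apple-touch-icon") !== -1;
    if (!isIcon) continue;
    try {
      // Relative hrefs resolve against the page URL; absolute ones are kept as-is,
      // which is how an attacker-chosen host ends up in the candidate list.
      found.push(new URL(href[1] ?? "", pageUrl).toString());
    } catch {
      // Unparseable hrefs are skipped.
    }
  }
  return found;
}

const sampleHtml =
  '<head><link rel="stylesheet" href="/s.css">' +
  '<link rel="icon" href="http://169.254.169.254/latest/meta-data/">' +
  '<link rel="apple-touch-icon" href="/touch.png"></head>';

console.log(extractIconUrls(sampleHtml, "https://attacker.example/page"));
```

The stylesheet link is ignored, while both icon links are returned, including the one that points at a link-local address.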
To ensure the extracted icon is a valid image file, the plugin relies on a dependency called reachable-url. This library invokes the got HTTP client to perform active HTTP GET or HEAD requests against the discovered icon URLs.
These verification requests originate from a subprocess defined in parseHtmlSubprocess.ts. This execution environment operates independently of Karakeep's primary validateUrl function and its secure proxy infrastructure.
The underlying got client within the subprocess lacks network protections such as IP blocklists, loopback restrictions, or egress proxy enforcement. The system processes any URL embedded in the HTML body as a trusted input for these secondary network calls.
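For illustration, the kind of destination check that the secondary requests never passed through might look like the sketch below. It is not Karakeep's validateUrl; `parseIpv4` and `isInternalIpv4` are hypothetical names, and the check covers IPv4 literals only. A production guard would also need to resolve hostnames, handle IPv6, and account for DNS rebinding.

```typescript
// Hypothetical helper: parse a dotted-quad IPv4 literal into four octets.
function parseIpv4(host: string): number[] | undefined {
  const parts = host.split(".");
  if (parts.length !== 4) return undefined;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  // NaN fails both comparisons, so malformed parts reject the whole address.
  return octets.every((o) => o >= 0 && o <= 255) ? octets : undefined;
}

// Hypothetical blocklist check for loopback, link-local, and private ranges.
// Returns false for hostnames; those would need DNS resolution first.
function isInternalIpv4(host: string): boolean {
  const octets = parseIpv4(host);
  if (!octets) return false;
  const [a = -1, b = -1] = octets;
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local, including cloud metadata
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

for (const host of ["169.254.169.254", "127.0.0.1", "172.20.1.5", "8.8.8.8"]) {
  console.log(host, isInternalIpv4(host) ? "blocked" : "allowed");
}
```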
Exploitation requires the attacker to control an external web server and possess the ability to submit URLs to the Karakeep bookmarking service. Authentication prerequisites depend strictly on the deployment configuration of the target application.
The attacker constructs a malicious webpage hosted on their external server. The HTML response contains standard tags but includes a crafted <link rel="icon"> element whose href attribute points to an internal IP address or a sensitive cloud metadata endpoint.
The attacker submits the URL of the malicious webpage to Karakeep. The main application evaluates the external URL, confirms it resolves to a public IP address, and forwards the task to the worker process.
The worker process retrieves the malicious webpage and parses the Document Object Model (DOM). The metascraper plugin identifies the crafted icon link and issues an HTTP request to the internal address to verify the resource, which completes the SSRF.
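To check whether a given deployment is affected, the page from the steps above can be reproduced as a test fixture. The following is a minimal sketch, assuming an isolated lab setup; `buildProbePage` and `escapeAttr` are hypothetical helpers, and the target should be a canary listener the tester controls rather than a real internal service.

```typescript
// Escape characters that would otherwise break out of a double-quoted attribute.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical fixture builder: an ordinary-looking page whose only unusual
// feature is the icon href. The page URL itself is public and passes validation.
function buildProbePage(iconTarget: string): string {
  return [
    "<!doctype html>",
    "<html>",
    "<head>",
    "<title>Ordinary article</title>",
    `<link rel="icon" href="${escapeAttr(iconTarget)}">`,
    "</head>",
    "<body><p>Ordinary content.</p></body>",
    "</html>",
  ].join("\n");
}

// Assumed canary address for a lab environment.
console.log(buildProbePage("http://127.0.0.1:8080/canary"));
```

If the canary listener records a request after the page is bookmarked, the worker is still probing icon URLs without destination checks.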
Successful exploitation grants the attacker unauthorized read access to internal network resources. The vulnerability facilitates lateral movement and network reconnaissance from the vantage point of the worker node.
In cloud-hosted environments, attackers frequently target the Instance Metadata Service (IMDS). A crafted URL pointing to http://169.254.169.254/latest/meta-data/ forces the worker to return sensitive infrastructure details or temporary identity and access management (IAM) credentials.
Attackers utilize the vulnerability to probe local services bound to 127.0.0.1. This enables interaction with unauthenticated administrative interfaces, local caching services, or internal databases running concurrently on the worker host.
The practical severity correlates directly with the network placement of the worker process. Worker instances operating without strict egress filtering present the highest risk of internal exploitation and data exfiltration.
The remediation strategy eliminates the secondary network probe entirely, shifting fetch responsibility back to the secure core application. Karakeep developers implemented a custom wrapper, metascraperSafeFavicon, in Pull Request #2763.
This wrapper overrides the default resolveFaviconUrl function provided by the metascraper-logo-favicon plugin. The new implementation parses the URL string and enforces protocol restrictions without executing arbitrary HTTP requests.
```typescript
async function resolveSafeFaviconUrl(
  faviconUrl: string,
): Promise<FaviconResolution | undefined> {
  let url: URL;
  try {
    url = new URL(faviconUrl);
  } catch {
    return undefined;
  }
  // Ensure protocol is safe, preventing file:// or dict:// SSRF schemes
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return undefined;
  }
  // Return the URL string directly without fetching/probing it
  return { url: url.toString() };
}
```

The patched logic validates that the protocol is strictly http: or https:. Once validated, the system returns the URL string, delegating the actual fetch operation to the main application's hardened fetchWithProxy function.
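The behavior of this check on a few inputs can be sketched as follows. `toSafeFaviconUrl` is a hypothetical synchronous restatement of the same logic, written here so the example runs on its own; it is not part of the Karakeep codebase.

```typescript
// Hypothetical synchronous restatement of the patched check, for illustration.
// It parses and filters by scheme, and never opens a network connection.
function toSafeFaviconUrl(faviconUrl: string): string | undefined {
  let url: URL;
  try {
    url = new URL(faviconUrl);
  } catch {
    return undefined; // not an absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return undefined; // rejects file:, dict:, and any other scheme
  }
  return url.toString();
}

const samples = [
  "https://example.com/favicon.ico",
  "file:///etc/passwd",
  "dict://127.0.0.1:11211/stats",
  "http://10.0.0.5/favicon.ico",
  "not a url",
];
for (const sample of samples) {
  console.log(sample, "->", toSafeFaviconUrl(sample));
}
```

Note that `http://10.0.0.5/favicon.ico` still passes, because the function filters schemes only. Under this design the protection against internal destinations comes from the fetch path that later consumes the returned string, so that path has to apply the address checks.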
Upgrading to Karakeep version 0.32.0 completely resolves the vulnerability. Administrators who cannot upgrade immediately should enforce strict network egress policies on worker nodes to restrict access to internal IP ranges and cloud metadata endpoints.
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N

| Product | Affected Versions | Fixed Version |
|---|---|---|
| Karakeep | < 0.32.0 | 0.32.0 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-918 |
| Attack Vector | Network |
| CVSS | 8.6 (High) |
| Impact | Information Disclosure / Internal Network Access |
| Exploit Status | PoC Available |
| Fixed Version | v0.32.0 |
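The 8.6 score follows from the vector string under the CVSS v3.1 base score equations. The sketch below implements those equations with the published metric weights; the function and constant names (`cvss31BaseScore`, `roundUp`, `weight`, and the weight tables) are chosen here for illustration, and temporal and environmental metrics are not handled.

```typescript
type Weights = Record<string, number>;

// Metric weights from the CVSS v3.1 specification.
const AV: Weights = { N: 0.85, A: 0.62, L: 0.55, P: 0.2 };
const AC: Weights = { L: 0.77, H: 0.44 };
const UI: Weights = { N: 0.85, R: 0.62 };
const CIA: Weights = { H: 0.56, L: 0.22, N: 0 };
const PR_UNCHANGED: Weights = { N: 0.85, L: 0.62, H: 0.27 };
const PR_CHANGED: Weights = { N: 0.85, L: 0.68, H: 0.5 };

// Look up a weight, failing loudly on a missing or unknown metric value.
function weight(table: Weights, key: string | undefined, metric: string): number {
  const value: number | undefined = key === undefined ? undefined : table[key];
  if (value === undefined) throw new Error(`unsupported value for ${metric}`);
  return value;
}

// Round up to one decimal place, using the integer method from the v3.1 spec
// to avoid floating-point artifacts.
function roundUp(input: number): number {
  const scaled = Math.round(input * 100000);
  return scaled % 10000 === 0
    ? scaled / 100000
    : (Math.floor(scaled / 10000) + 1) / 10;
}

function cvss31BaseScore(vector: string): number {
  const m: Record<string, string> = {};
  // Skip the leading "CVSS:3.1" prefix, then read NAME:VALUE pairs.
  for (const part of vector.split("/").slice(1)) {
    const [name, value] = part.split(":");
    if (name && value) m[name] = value;
  }
  const changed = m["S"] === "C";
  const iss =
    1 -
    (1 - weight(CIA, m["C"], "C")) *
      (1 - weight(CIA, m["I"], "I")) *
      (1 - weight(CIA, m["A"], "A"));
  const impact = changed
    ? 7.52 * (iss - 0.029) - 3.25 * Math.pow(iss - 0.02, 15)
    : 6.42 * iss;
  const exploitability =
    8.22 *
    weight(AV, m["AV"], "AV") *
    weight(AC, m["AC"], "AC") *
    weight(changed ? PR_CHANGED : PR_UNCHANGED, m["PR"], "PR") *
    weight(UI, m["UI"], "UI");
  if (impact <= 0) return 0;
  const raw = changed ? 1.08 * (impact + exploitability) : impact + exploitability;
  return roundUp(Math.min(raw, 10));
}

console.log(cvss31BaseScore("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:N/A:N")); // 8.6
```

For this vector the impact sub-score is roughly 3.99 and the exploitability sub-score roughly 3.89. Because scope is changed, their sum is multiplied by 1.08, giving about 8.51, which rounds up to 8.6.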
The web application receives a URL or similar request from an upstream component and retrieves the contents of this URL, but it does not sufficiently ensure that the request is being sent to the expected destination.