May 21, 2026
SageMaker Python SDK leaked symmetric HMAC keys in job environment variables, allowing attackers to forge signatures and achieve RCE via malicious model artifacts.
The Amazon SageMaker Python SDK is vulnerable to arbitrary code execution due to the cleartext storage of a symmetric HMAC signing key in job environment variables. An authenticated attacker with `Describe` permissions can extract this key to forge valid integrity signatures for malicious model artifacts.
The Amazon SageMaker Python SDK facilitates the development, training, and deployment of machine learning models on AWS. The SDK includes a "Remote Function" capability that allows developers to serialize Python functions using the cloudpickle library and upload them to Amazon S3 for execution in remote inference containers.
CVE-2026-8596 identifies a critical design flaw within this Remote Function implementation. The SDK utilizes cryptographic signatures to verify the integrity of serialized objects before deserialization, preventing the execution of tampered payloads.
In vulnerable versions, the SDK implements this integrity check using a symmetric HMAC-SHA256 key. The SDK stores this key in cleartext within the environment variables of the remote execution container.
Because these environment variables are accessible via the SageMaker Describe APIs, the design violates CWE-312 (Cleartext Storage of Sensitive Information). An authenticated attacker can extract the key, bypass the integrity mechanism, and achieve object injection.
The root cause of CVE-2026-8596 is the architectural decision to distribute symmetric cryptographic material through observable infrastructure configurations. When a user creates a Remote Function, the client SDK generates an HMAC-SHA256 secret key.
The SDK transmits this key to the target SageMaker job by injecting it into the REMOTE_FUNCTION_SECRET_KEY environment variable. The system design assumes the environment variable remains confined to the execution boundary of the inference container.
This assumption is fundamentally flawed in the context of the AWS control plane. The DescribeTrainingJob and related SageMaker APIs return the complete job configuration upon request, including the user-defined environment variables.
Any IAM principal possessing the sagemaker:DescribeTrainingJob permission can retrieve the exact symmetric key used to secure the payload. Since the key is symmetric, possessing it grants the attacker both validation and signing capabilities.
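The symmetric property can be illustrated with a minimal sketch using only the Python standard library. The key value, payload bytes, and helper names (`sign`, `verify`) are hypothetical stand-ins and not the SDK's own code; the point is only that the same key that verifies a tag also produces one.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, payload: bytes) -> str:
    """Return the hex HMAC-SHA256 tag of payload under key."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, payload), tag)


# Hypothetical stand-in for the secret placed in the job environment.
shared_key = secrets.token_bytes(32)

original = b"serialized function bytes"
original_tag = sign(shared_key, original)

# Any holder of shared_key can tag different bytes, and the verifier
# cannot distinguish that tag from one produced by the legitimate client.
substituted = b"different bytes"
substituted_tag = sign(shared_key, substituted)

original_ok = verify(shared_key, original, original_tag)
substituted_ok = verify(shared_key, substituted, substituted_tag)
mismatch_ok = verify(shared_key, substituted, original_tag)
```

Both `original_ok` and `substituted_ok` evaluate to `True`, while `mismatch_ok` is `False`: the check detects accidental corruption but offers no protection against a party that holds the key.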
The vulnerability stems from the use of symmetric HMAC operations where asymmetric cryptography is required for the trust model. The client SDK originally generated a random secret, packed it into the environment, and calculated an HMAC over the serialized cloudpickle object.
The remote container retrieved the secret via os.environ.get("REMOTE_FUNCTION_SECRET_KEY") and performed an identical HMAC calculation to verify data integrity. The fix implemented in Pull Request #5708 replaces this mechanism entirely with ECDSA P-256 signatures.
The patched client SDK now generates an asymmetric key pair. The private key stays in memory on the client system, where it signs the serialized object payload.
The SDK passes only the public key component to the remote environment via the environment variable. The following code snippet demonstrates the updated verification logic in serialization.py where the container utilizes the public key to validate the signature:
    def _verify_asymmetric_signature(metadata: _MetaData, buffer: bytes, public_key_pem: str):
        # Verification using ECDSA and the public key provided in metadata
        signature_bytes = base64.b64decode(metadata.asymmetric_signature)
        public_key = crypto_serialization.load_pem_public_key(public_key_pem.encode())
        public_key.verify(signature_bytes, buffer, ec.ECDSA(hashes.SHA256()))

By transitioning to ECDSA, the patch eliminates the exposure of signing capabilities. Even if an attacker reads the REMOTE_FUNCTION_SECRET_KEY environment variable, they obtain only the public key, rendering them unable to forge valid signatures for malicious payloads.
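For contrast with the symmetric scheme, the following is a minimal round-trip sketch of ECDSA P-256 signing and verification using the `cryptography` package. It is not the SDK's implementation: the payload bytes and the helper name `is_authentic` are hypothetical, and the sketch assumes the public key is exchanged as PEM text.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

# Client side: the private key never leaves this process.
private_key = ec.generate_private_key(ec.SECP256R1())
public_key_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
).decode()

payload = b"serialized function bytes"
signature = private_key.sign(payload, ec.ECDSA(hashes.SHA256()))


def is_authentic(public_key_pem: str, payload: bytes, signature: bytes) -> bool:
    """Container side (hypothetical helper): verify with the public key only."""
    public_key = serialization.load_pem_public_key(public_key_pem.encode())
    try:
        public_key.verify(signature, payload, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False


genuine_ok = is_authentic(public_key_pem, payload, signature)
tampered_ok = is_authentic(public_key_pem, payload + b"!", signature)
```

Here `genuine_ok` is `True` and `tampered_ok` is `False`. A reader of `public_key_pem` can run the same verification but has no way to produce a signature for new bytes.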
Exploitation of CVE-2026-8596 requires a specific combination of IAM permissions. The attacker must hold authorization to call Describe APIs on SageMaker jobs and write access to the specific Amazon S3 bucket storing the model artifacts.
The attack sequence begins with the adversary targeting an active or historical SageMaker job. The attacker issues a DescribeTrainingJob API request to the AWS control plane to extract the REMOTE_FUNCTION_SECRET_KEY from the job's environment variables.
Upon obtaining the symmetric key, the attacker constructs a malicious Python payload designed to execute arbitrary operating system commands. The attacker serializes this payload using the cloudpickle library.
The attacker signs the malicious object using the stolen HMAC-SHA256 key and updates the corresponding metadata.json file. Finally, the attacker overwrites the legitimate model artifacts in S3 using the s3:PutObject operation.
When the SageMaker inference container initializes or processes a new task, it downloads the tampered payload. The container validates the forged signature, determines the payload is authentic, and deserializes the object, granting the attacker arbitrary code execution.
Exploitation of CVE-2026-8596 leads directly to remote code execution within the SageMaker inference container. Deserializing untrusted cloudpickle data inherently allows the execution of arbitrary Python bytecode or system commands.
This execution occurs with the privileges of the SageMaker container process. From this position, an attacker can access any IAM credentials attached to the SageMaker execution role, enabling lateral movement or privilege escalation within the AWS environment.
The attacker also gains unauthorized access to proprietary machine learning models, training data, and any interconnected database credentials present in the execution environment. The ability to intercept and modify inference results severely compromises data integrity.
The CVSS v3.1 base score of 7.2 accurately reflects the high confidentiality, integrity, and availability impacts. The requirement for specific IAM permissions correctly limits the attack vector, preventing anonymous exploitation over the open internet.
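The 7.2 figure can be reproduced from the vector string listed for this CVE, CVSS:3.1/AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H. The sketch below applies the CVSS v3.1 base score equations for the scope-unchanged case; it carries only the metric weights that this particular vector needs, so it is a worked example and not a general calculator.

```python
import math

# CVSS v3.1 weights for the metric values in this vector (scope unchanged).
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"H": 0.27},
    "UI": {"N": 0.85},
    "C": {"H": 0.56},
    "I": {"H": 0.56},
    "A": {"H": 0.56},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for a scope-unchanged vector covered by WEIGHTS."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles S:U")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H")
```

The impact sub-score is roughly 5.87 and the exploitability sub-score roughly 1.23; the low exploitability comes almost entirely from the PR:H weight of 0.27, which is how the IAM permission requirement shows up in the score.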
The primary remediation for CVE-2026-8596 is upgrading the Amazon SageMaker Python SDK to a patched version. AWS addressed the vulnerability in SDK v2 series version 2.257.2 and v3 series version 3.8.0.
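A quick way to check an environment against the affected ranges in this advisory is sketched below. The helper names are hypothetical, and the parser assumes plain numeric release versions such as `2.257.2`; pre-release or local version strings would need a fuller parser.

```python
def parse_version(text: str) -> tuple:
    """Turn '2.257.2' into (2, 257, 2); assumes numeric release versions."""
    return tuple(int(part) for part in text.split("."))


# Affected ranges from this advisory: (first affected, first fixed).
AFFECTED_RANGES = [
    ((2, 199, 0), (2, 257, 2)),
    ((3, 0, 0), (3, 8, 0)),
]


def is_affected(version: str) -> bool:
    """True if version falls inside one of the affected ranges."""
    parsed = parse_version(version)
    return any(low <= parsed < fixed for low, fixed in AFFECTED_RANGES)


def installed_sdk_is_affected() -> bool:
    """Check the locally installed 'sagemaker' distribution, if any."""
    from importlib import metadata

    try:
        return is_affected(metadata.version("sagemaker"))
    except metadata.PackageNotFoundError:
        return False
```

For example, `is_affected("2.257.1")` is `True`, while `is_affected("2.257.2")` and `is_affected("3.8.0")` are `False`.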
Patching the client SDK prevents the creation of new vulnerable jobs. Organizations must also systematically rebuild and redeploy any existing models or remote functions created with older, vulnerable versions of the SDK.
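Finding the jobs that need rebuilding could start from the job descriptions themselves. The sketch below is a heuristic under stated assumptions: it assumes that jobs created by a patched SDK carry a PEM-encoded public key in REMOTE_FUNCTION_SECRET_KEY (as described above), so any other value in that variable is treated as a legacy symmetric secret. The helper names are hypothetical; `describe_training_job` and its `Environment` response field are part of the boto3 SageMaker client.

```python
ENV_VAR = "REMOTE_FUNCTION_SECRET_KEY"
PEM_PUBLIC_PREFIX = "-----BEGIN PUBLIC KEY-----"  # assumed format for patched jobs


def needs_rebuild(description: dict) -> bool:
    """True if a DescribeTrainingJob response carries a non-public-key value."""
    value = description.get("Environment", {}).get(ENV_VAR)
    if value is None:
        return False
    return not value.strip().startswith(PEM_PUBLIC_PREFIX)


def jobs_needing_rebuild(job_names, region_name=None):
    """Describe each named training job and return the ones flagged."""
    import boto3  # imported lazily so needs_rebuild works without AWS access

    client = boto3.client("sagemaker", region_name=region_name)
    flagged = []
    for name in job_names:
        description = client.describe_training_job(TrainingJobName=name)
        if needs_rebuild(description):
            flagged.append(name)
    return flagged
```

Treat the result as a starting list for review, not as proof that a job is safe; the SDK version that created the job remains the authoritative signal.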
Rebuilding ensures the SDK utilizes the new ECDSA asymmetric key process for the entire lifecycle of the model artifact. AWS provides the ModelBuilder tool within the SDK to facilitate this redevelopment process.
As a defense-in-depth measure, organizations should audit AWS IAM policies and restrict sagemaker:DescribeTrainingJob and s3:PutObject permissions to the principals that need them for normal operation.
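Such an audit could begin with a simple scan of policy documents for the two permissions the attack requires. The sketch below is deliberately narrow: it looks only at `Allow` statements and their `Action` patterns, and ignores `Deny` statements, `NotAction`, resource scoping, and conditions, so it can over-report. The function names are hypothetical.

```python
from fnmatch import fnmatchcase

SENSITIVE_ACTIONS = ("sagemaker:DescribeTrainingJob", "s3:PutObject")


def _as_list(value):
    """IAM allows a single item or a list for Statement and Action."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def allowed_sensitive_actions(policy: dict) -> set:
    """Sensitive actions matched by any Allow statement's Action patterns."""
    found = set()
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action")):
            for action in SENSITIVE_ACTIONS:
                # IAM action names are case-insensitive and support wildcards.
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return found


def grants_both(policy: dict) -> bool:
    """True if the policy allows both actions the attack depends on."""
    return allowed_sensitive_actions(policy) == set(SENSITIVE_ACTIONS)
```

A policy that allows `sagemaker:Describe*` together with `s3:*` would be flagged by `grants_both`, which makes it a candidate for tightening.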
CVSS:3.1/AV:N/AC:L/PR:H/UI:N/S:U/C:H/I:H/A:H

| Product | Affected Versions | Fixed Version |
|---|---|---|
| Amazon SageMaker Python SDK (v2) | >= 2.199.0, < 2.257.2 | 2.257.2 |
| Amazon SageMaker Python SDK (v3) | >= 3.0.0, < 3.8.0 | 3.8.0 |

| Attribute | Detail |
|---|---|
| CWE ID | CWE-312 |
| Attack Vector | Network |
| CVSS Score | 7.2 (High) |
| EPSS Score | 0.10% |
| Impact | Arbitrary Code Execution |
| Exploit Status | Proof of Concept |
| KEV Status | Not Listed |
CWE-312 (Cleartext Storage of Sensitive Information): The application stores sensitive information in cleartext in a location accessible to unauthorized actors or via unintended API responses.