GHSA-9f8f-2vmf-885j
HIGHData exposure via ZeroMQ on multi-node vLLM deployment
EPSS Exploitation Probability
EPSS (Exploit Prediction Scoring System) is a daily probability model maintained by FIRST.org. It estimates the likelihood a CVE will be exploited in production environments within the next 30 days, derived from real-world threat intelligence signals.
Blast Radius
vllmReal-time download stats are indexed for npm and PyPI packages. This vulnerability affects PyPI packages — download data is not available via public APIs for these ecosystems.
Description
Impact
In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts.
Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker.
By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher.
Detailed Analysis
The XPUB socket in question is created here:
Data is published over this socket via MessageQueue.enqueue() which is called by MessageQueue.broadcast_object():
The MessageQueue.broadcast_object() method is called by the GroupCoordinator.broadcast_object() method in parallel_state.py:
The broadcast over ZeroMQ is only done if the GroupCoordinator was created with use_message_queue_broadcaster set to True:
The only case where GroupCoordinator is created with use_message_queue_broadcaster is the coordinator for the tensor parallelism group:
To determine what data is broadcasted to the tensor parallism group, we must continue tracing. GroupCoordinator.broadcast_object() is called by GroupCoordinator.broadcoast_tensor_dict():
which is called by broadcast_tensor_dict() in communication_op.py:
If we look at _get_driver_input_and_broadcast() in the V0 worker_base.py, we'll see how this tensor dict is formed:
but the data actually sent over ZeroMQ is the metadata_list portion that is split from this tensor_dict. The tensor parts are sent via torch.distributed and only metadata about those tensors is sent via ZeroMQ.
Patches
Workarounds
Prior to the fix, your options include:
- Do not expose the vLLM host to a network where any untrusted connections may reach the host.
- Ensure that only the other vLLM hosts are able to connect to the TCP port used for the
XPUBsocket. Note that port used is random.
References
- Relevant code first introduced in https://github.com/vllm-project/vllm/pull/6183
Affected Packages
| Ecosystem | Package | Vulnerable range | Fix |
|---|---|---|---|
| 🐍PyPI | vllm | ≥ 0.5.2&&< 0.8.5 | 0.8.5 |
Detection & mitigation playbook
Open-source dependencyDetect
Scan your dependency tree (package-lock.json, pnpm-lock.yaml, requirements.txt, go.sum, etc.) for vllm. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
Fix
Update vllm to 0.8.5 or later, then make sure no transitive (indirect) dependency still pins the vulnerable range — O3 confirms GHSA-9f8f-2vmf-885j is resolved across your whole dependency graph.
Workarounds
If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
How O3 protects you
O3 pinpoints whether GHSA-9f8f-2vmf-885j is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.
Tailored to GHSA-9f8f-2vmf-885j. Runtime protection reduces exposure until a permanent patch is applied and verified — it complements patching, it doesn't replace it.
Frequently Asked Questions
Is GHSA-9f8f-2vmf-885j in your dependencies?
O3 detects GHSA-9f8f-2vmf-885j across PyPI dependencies and uses function-level reachability to confirm whether the vulnerable code path is actually reachable — not just present. No false positives.