GHSA-w6q7-j642-7c25
MEDIUMvLLM has a Regular Expression Denial of Service (ReDoS, Exponential Complexity) Vulnerability in `pythonic_tool_parser.py`
EPSS Exploitation Probability
EPSS (Exploit Prediction Scoring System) is a daily probability model maintained by FIRST.org. It estimates the likelihood a CVE will be exploited in production environments within the next 30 days, derived from real-world threat intelligence signals.
Blast Radius
vllmReal-time download stats are indexed for npm and PyPI packages. This vulnerability affects PyPI packages — download data is not available via public APIs for these ecosystems.
Description
Summary
A Regular Expression Denial of Service (ReDoS) vulnerability exists in the file vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py of the vLLM project. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable.
Details
The following regular expression is used to match tool/function call patterns:
r"\[([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s)?\),\s*)*([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s*)?\)\s*)+\]"
This pattern contains multiple nested quantifiers (*, +), optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking.
Attack Example: A malicious input such as
[A(A= )A(A=, )A(A=, )A(A=, )... (repeated dozens of times) ...]
or
"[A(A=" + "\t)A(A=,\t" * repeat
can cause the regular expression engine to consume CPU exponentially with the input length, effectively freezing or crashing the server (DoS).
Proof of Concept: A Python script demonstrates that matching such a crafted string with the above regex results in exponential time complexity. Even moderate input lengths can bring the system to a halt.
Length: 22, Time: 0.0000 seconds, Match: False
Length: 38, Time: 0.0010 seconds, Match: False
Length: 54, Time: 0.0250 seconds, Match: False
Length: 70, Time: 0.5185 seconds, Match: False
Length: 86, Time: 13.2703 seconds, Match: False
Length: 102, Time: 319.0717 seconds, Match: False
Impact
- Denial of Service (DoS): An attacker can trigger a denial of service by sending specially crafted payloads to any API or interface that invokes this regex, causing excessive CPU usage and making the vLLM service unavailable.
- Resource Exhaustion and Memory Retention: As this regex is invoked during function call parsing, the matching process may hold on to significant CPU and memory resources for extended periods (due to catastrophic backtracking). In the context of vLLM, this also means that the associated KV cache (used for model inference and typically stored in GPU memory) is not released in a timely manner. This can lead to GPU memory exhaustion, degraded throughput, and service instability.
- Potential for Broader System Instability: Resource exhaustion from stuck or slow requests may cascade into broader system instability or service downtime if not mitigated.
Fix
- https://github.com/vllm-project/vllm/pull/18454
- Note that while this change has significantly improved performance, this regex may still be problematic. It has gone from exponential time complexity, O(2^N), to O(N^2).
Affected Packages
| Ecosystem | Package | Vulnerable range | Fix |
|---|---|---|---|
| 🐍PyPI | vllm | ≥ 0.6.4&&< 0.9.0 | 0.9.0 |
Detection & mitigation playbook
Open-source dependencyDetect
Scan your dependency tree (package-lock.json, pnpm-lock.yaml, requirements.txt, go.sum, etc.) for vllm. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
Fix
Update vllm to 0.9.0 or later, then make sure no transitive (indirect) dependency still pins the vulnerable range — O3 confirms GHSA-w6q7-j642-7c25 is resolved across your whole dependency graph.
Workarounds
If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
How O3 protects you
O3 pinpoints whether GHSA-w6q7-j642-7c25 is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.
Tailored to GHSA-w6q7-j642-7c25. Runtime protection reduces exposure until a permanent patch is applied and verified — it complements patching, it doesn't replace it.
Frequently Asked Questions
Is GHSA-w6q7-j642-7c25 in your dependencies?
O3 detects GHSA-w6q7-j642-7c25 across PyPI dependencies and uses function-level reachability to confirm whether the vulnerable code path is actually reachable — not just present. No false positives.