GHSA-cwxj-rr6w-m6w7
Severity: HIGH
Scrapy: Arbitrary Module Import via Referrer-Policy Header in RefererMiddleware
Description
Impact
Since version 1.4.0, Scrapy respects the Referrer-Policy response header to decide whether and how to set a Referer header on follow-up requests.
If the header value looked like a valid Python import path, Scrapy would import the referenced object and call it, assuming it referred to a referrer policy class (for example, scrapy.spidermiddlewares.referer.DefaultReferrerPolicy) and attempting to instantiate it to handle the Referer header.
A malicious site could exploit this by setting Referrer-Policy to a path such as sys.exit, causing Scrapy to import and execute it and potentially terminate the process.
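The pattern described above can be illustrated with a minimal, dependency-free sketch. It is not Scrapy's actual code: `load_object` here is a simplified stand-in for an import-path loader, and both `policy_from_header_*` functions are hypothetical names. The second function shows one possible hardening (an allowlist of the standard policy tokens), which is an assumption for illustration and not necessarily what the 2.14.2 patch does.

```python
import importlib

# Policy tokens defined by the W3C Referrer Policy specification.
STANDARD_POLICIES = {
    "no-referrer",
    "no-referrer-when-downgrade",
    "same-origin",
    "origin",
    "strict-origin",
    "origin-when-cross-origin",
    "strict-origin-when-cross-origin",
    "unsafe-url",
}


def load_object(path):
    """Simplified stand-in: resolve 'pkg.module.name' to the named object."""
    module_path, _, name = path.rpartition(".")
    if not module_path:
        raise ValueError(f"not an import path: {path!r}")
    return getattr(importlib.import_module(module_path), name)


def policy_from_header_unsafe(header_value):
    """Vulnerable pattern: the remote server chooses what is imported and called."""
    factory = load_object(header_value)
    return factory()


def policy_from_header_checked(header_value):
    """Hypothetical hardening: accept only known policy tokens from the network."""
    token = header_value.strip().lower()
    return token if token in STANDARD_POLICIES else None


if __name__ == "__main__":
    try:
        policy_from_header_unsafe("sys.exit")
    except SystemExit:
        # Without this handler the interpreter would exit here.
        print("header value triggered SystemExit")
    print(policy_from_header_checked("sys.exit"))     # None: rejected
    print(policy_from_header_checked("same-origin"))  # same-origin
```

In the unsafe variant, any importable callable that accepts zero arguments can be reached from a response header, which is why `sys.exit` is enough to stop a crawl.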
Patches
Upgrade to Scrapy 2.14.2 (or later).
Workarounds
If you cannot upgrade to Scrapy 2.14.2, consider the following mitigations.
- Disable the middleware: If you don't need the Referer header on follow-up requests, set REFERER_ENABLED to False.
- Set headers manually: If you do need a Referer, disable the middleware and set the header explicitly on the requests that require it.
- Set referrer_policy in request metadata: If disabling the middleware is not viable, set the referrer_policy request meta key on all requests to prevent evaluating preceding responses' Referrer-Policy. For example:
```python
Request(
    url,
    meta={
        "referrer_policy": "scrapy.spidermiddlewares.referer.DefaultReferrerPolicy",
    },
)
```
Instead of editing requests individually, you can:
- implement a custom spider middleware that runs before the built-in referrer policy middleware and sets the referrer_policy meta key; or
- set the meta key in start requests and use the scrapy-sticky-meta-params plugin to propagate it to follow-up requests.
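A custom spider middleware along the lines of the first option might look like the following sketch. The class name `PinReferrerPolicyMiddleware` is hypothetical, and to keep the example dependency-free it treats any object with a dict-valued `meta` attribute as a request; in a real project you would more likely check `isinstance(obj, scrapy.Request)`.

```python
# Policy to pin on every outgoing request (same value as in the example above).
POLICY = "scrapy.spidermiddlewares.referer.DefaultReferrerPolicy"


class PinReferrerPolicyMiddleware:
    """Hypothetical spider middleware that sets referrer_policy on requests."""

    def process_spider_output(self, response, result, spider):
        for obj in result:
            # Assumption: only requests expose a dict-like .meta attribute;
            # items and other objects pass through unchanged.
            meta = getattr(obj, "meta", None)
            if isinstance(meta, dict):
                # Keep a policy the spider set explicitly; otherwise pin ours.
                meta.setdefault("referrer_policy", POLICY)
            yield obj
```

To take effect before the built-in middleware sees the request, this class would need to be registered in SPIDER_MIDDLEWARES with an order value above the built-in middleware's default of 700 (for example 750), since process_spider_output methods run in decreasing order. Spiders whose callbacks are asynchronous generators may also need an asynchronous counterpart of this method.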
If you want to continue respecting legitimate Referrer-Policy headers while protecting against malicious ones, disable the built-in referrer policy middleware by setting it to None in SPIDER_MIDDLEWARES and replace it with the fixed implementation from Scrapy 2.14.2.
If the Scrapy 2.14.2 implementation is incompatible with your project (for example, because your Scrapy version is older), copy the corresponding middleware from your Scrapy version, apply the same patch, and use that as a replacement.
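As a rough sketch, swapping the built-in middleware for a patched copy could look like this in a project's settings module. The path `myproject.middlewares.PatchedRefererMiddleware` is a hypothetical location for your copy of the fixed implementation; 700 is the order the built-in middleware uses by default.

```python
# settings.py (sketch)
SPIDER_MIDDLEWARES = {
    # Setting the built-in referrer policy middleware to None disables it.
    "scrapy.spidermiddlewares.referer.RefererMiddleware": None,
    # Hypothetical path to the patched copy kept in your own project,
    # registered at the same order as the middleware it replaces.
    "myproject.middlewares.PatchedRefererMiddleware": 700,
}
```

Keeping the replacement at the same order means it sits in the same position relative to the other spider middlewares as the one it replaces.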
Affected Packages
| Ecosystem | Package | Vulnerable range | Fix |
|---|---|---|---|
| 🐍 PyPI | scrapy | ≥ 1.4.0, < 2.14.2 | 2.14.2 |
Detection & mitigation playbook
Open-source dependency
Detect
Scan your Python dependency files (requirements.txt, poetry.lock, Pipfile.lock, uv.lock, etc.) for scrapy in the range ≥ 1.4.0, < 2.14.2. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
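For a quick local check, a small script can compare the installed Scrapy version against the affected range from the table above. This is a minimal sketch using only the standard library: the function names are hypothetical, it inspects only the environment it runs in, and it ignores pre-release suffixes, so it does not replace scanning lockfiles.

```python
import re
from importlib import metadata

VULNERABLE_FROM = (1, 4, 0)  # first affected release
FIXED_IN = (2, 14, 2)        # first patched release


def release_tuple(version):
    """Parse the leading 'X.Y[.Z]' of a version string into a 3-tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def is_vulnerable(version):
    """True if the version falls in the affected range >= 1.4.0, < 2.14.2."""
    return VULNERABLE_FROM <= release_tuple(version) < FIXED_IN


def installed_scrapy_is_vulnerable():
    """Return True/False for the installed Scrapy, or None if it is not installed."""
    try:
        return is_vulnerable(metadata.version("scrapy"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_scrapy_is_vulnerable())
```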
Fix
Update scrapy to 2.14.2 or later, then make sure no transitive (indirect) dependency still pins the vulnerable range — O3 confirms GHSA-cwxj-rr6w-m6w7 is resolved across your whole dependency graph.
Workarounds
If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
How O3 protects you
O3 pinpoints whether GHSA-cwxj-rr6w-m6w7 is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.
Tailored to GHSA-cwxj-rr6w-m6w7. Runtime protection reduces exposure until a permanent patch is applied and verified — it complements patching, it doesn't replace it.
Frequently Asked Questions
Is GHSA-cwxj-rr6w-m6w7 in your dependencies?
O3 detects GHSA-cwxj-rr6w-m6w7 across PyPI dependencies and uses function-level reachability to confirm whether the vulnerable code path is actually reachable — not just present. No false positives.