GHSA-c67j-w6g6-q2cm
CRITICAL
LangChain serialization injection vulnerability enables secret extraction in dumps/loads APIs
Description
Summary
A serialization injection vulnerability exists in LangChain's dumps() and dumpd() functions. The functions do not escape dictionaries with 'lc' keys when serializing free-form dictionaries. The 'lc' key is used internally by LangChain to mark serialized objects. When user-controlled data contains this key structure, it is treated as a legitimate LangChain object during deserialization rather than plain user data.
Attack surface
The core vulnerability was in dumps() and dumpd(): these functions failed to escape user-controlled dictionaries containing 'lc' keys. When this unescaped data was later deserialized via load() or loads(), the injected structures were treated as legitimate LangChain objects rather than plain user data.
This escaping bug enabled several attack vectors:
- Injection via user data: Malicious LangChain object structures could be injected through user-controlled fields like metadata, additional_kwargs, or response_metadata.
- Class instantiation within trusted namespaces: Injected manifests could instantiate any Serializable subclass, but only within the pre-approved trusted namespaces (langchain_core, langchain, langchain_community). This includes classes with side effects in __init__ (network calls, file operations, etc.). Note that namespace validation was already enforced before this patch, so arbitrary classes outside these trusted namespaces could not be instantiated.
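The confusion between user data and serialized objects can be shown with a toy model. This is a minimal sketch and not LangChain's implementation: the `__escaped__` wrapper key and the `naive_dump`, `escape`, and `revive` helpers are hypothetical names chosen for illustration. It shows a marker dict being revived when it is serialized as-is, and surviving as plain data when it is wrapped first.

```python
import json

MARKER = "lc"            # key the toy format uses to tag serialized objects
ESCAPE = "__escaped__"   # hypothetical wrapper key, not LangChain's actual scheme


def naive_dump(value):
    """Serialize without escaping: user dicts pass through untouched."""
    return json.dumps(value)


def escape(value):
    """Wrap any user dict that could be mistaken for a marker or a wrapper."""
    if isinstance(value, dict):
        if MARKER in value or ESCAPE in value:
            return {ESCAPE: value}  # the whole subtree is treated as plain data
        return {key: escape(child) for key, child in value.items()}
    if isinstance(value, list):
        return [escape(child) for child in value]
    return value


def revive(value):
    """Turn marker dicts into 'objects' (a tagged tuple here) and unwrap escapes."""
    if isinstance(value, dict):
        if set(value) == {ESCAPE}:
            return value[ESCAPE]  # escaped user data is returned verbatim
        if MARKER in value:
            return ("REVIVED", value.get("type"), tuple(value.get("id", [])))
        return {key: revive(child) for key, child in value.items()}
    if isinstance(value, list):
        return [revive(child) for child in value]
    return value


payload = {"metadata": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}}

# Without escaping, the user-supplied dict is revived as if it were an object
unsafe = revive(json.loads(naive_dump(payload)))

# With escaping, the same dict round-trips as plain data
safe = revive(json.loads(json.dumps(escape(payload))))
```

In this model `unsafe["metadata"]` is no longer the dictionary the user supplied, while `safe` is equal to the original payload.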
Security hardening
This patch fixes the escaping bug in dumps() and dumpd() and introduces new restrictive defaults in load() and loads(): allowlist enforcement via allowed_objects="core" (restricted to serialization mappings), secrets_from_env changed from True to False, and default Jinja2 template blocking via init_validator. These are breaking changes for some use cases.
Who is affected?
Applications are vulnerable if they:
- Use astream_events(version="v1"): The v1 implementation internally uses vulnerable serialization. Note: astream_events(version="v2") is not vulnerable.
- Use Runnable.astream_log(): This method internally uses vulnerable serialization for streaming outputs.
- Call dumps() or dumpd() on untrusted data, then deserialize with load() or loads(): Trusting your own serialization output makes you vulnerable if user-controlled data (e.g., from LLM responses, metadata fields, or user inputs) contains 'lc' key structures.
- Deserialize untrusted data with load() or loads(): Directly deserializing untrusted data that may contain injected 'lc' structures.
- Use RunnableWithMessageHistory: Internal serialization in message history handling.
- Use InMemoryVectorStore.load() to deserialize untrusted documents.
- Load untrusted generations from cache using langchain-community caches.
- Load untrusted manifests from the LangChain Hub via hub.pull.
- Use StringRunEvaluatorChain on untrusted runs.
- Use create_lc_store or create_kv_docstore with untrusted documents.
- Use MultiVectorRetriever with byte stores containing untrusted documents.
- Use LangSmithRunChatLoader with runs containing untrusted messages.
The most common attack vector is through LLM response fields like additional_kwargs or response_metadata, which can be controlled via prompt injection and then serialized/deserialized in streaming operations.
Impact
Attackers who control serialized data can extract environment variable secrets by injecting {"lc": 1, "type": "secret", "id": ["ENV_VAR"]}, which causes the named environment variable to be loaded during deserialization (when secrets_from_env=True, which was the old default). They can also inject constructor structures to instantiate any class within the trusted namespaces with attacker-controlled parameters, potentially triggering side effects such as network calls or file operations.
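The role of secrets_from_env can be sketched without LangChain installed. The `revive_secret` helper below is hypothetical and only models the behaviour described above: a secret marker is resolved from an explicit map first, and from the process environment only when the flag is enabled. The variable name DEMO_API_KEY and its value are placeholders.

```python
import os


def revive_secret(node, *, secrets_from_env, secrets_map=None):
    """Resolve a toy {"lc": 1, "type": "secret", "id": [NAME]} marker."""
    secrets_map = secrets_map or {}
    is_secret = (
        isinstance(node, dict)
        and node.get("lc") == 1
        and node.get("type") == "secret"
    )
    if not is_secret:
        return node  # ordinary data is passed through unchanged
    name = node["id"][0]
    if name in secrets_map:
        return secrets_map[name]  # explicitly provided secrets win
    if secrets_from_env and name in os.environ:
        return os.environ[name]  # the path an injected marker abuses
    return None


os.environ["DEMO_API_KEY"] = "sk-demo-not-a-real-key"
injected = {"lc": 1, "type": "secret", "id": ["DEMO_API_KEY"]}

leaked = revive_secret(injected, secrets_from_env=True)
blocked = revive_secret(injected, secrets_from_env=False)
```

With the flag enabled, the attacker-chosen variable name is enough to pull the value into the deserialized output; with it disabled, the marker resolves to nothing.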
Key severity factors:
- Affects the serialization path - applications trusting their own serialization output are vulnerable
- Enables secret extraction when combined with secrets_from_env=True (the old default)
- LLM responses in additional_kwargs can be controlled via prompt injection
Exploit example
from langchain_core.load import dumps, loads
import os
# Attacker injects secret structure into user-controlled data
attacker_dict = {
    "user_data": {
        "lc": 1,
        "type": "secret",
        "id": ["OPENAI_API_KEY"]
    }
}
serialized = dumps(attacker_dict)  # Bug: does NOT escape the 'lc' key
os.environ["OPENAI_API_KEY"] = "sk-secret-key-12345"
# dumps() returns a JSON string, so it is deserialized with loads();
# load() expects an already-parsed object such as the output of dumpd()
deserialized = loads(serialized, secrets_from_env=True)
# On vulnerable versions this prints the environment variable's value
print(deserialized["user_data"])  # "sk-secret-key-12345" - SECRET LEAKED!
Security hardening changes (breaking changes)
This patch introduces three breaking changes to load() and loads():
- New allowed_objects parameter (defaults to 'core'): Enforces allowlist of classes that can be deserialized. The 'all' option corresponds to the list of objects specified in mappings.py while the 'core' option limits to objects within langchain_core. We recommend that users explicitly specify which objects they want to allow for serialization/deserialization.
- secrets_from_env default changed from True to False: Disables automatic secret loading from environment
- New init_validator parameter (defaults to default_init_validator): Blocks Jinja2 templates by default
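The idea behind allowlist enforcement can be illustrated with a small stand-alone check. This is a sketch under stated assumptions, not the logic inside load(): the `check_allowed` helper and the `my_pkg` class paths are hypothetical, and the manifest shape is simplified to the "type" and "id" fields shown in this advisory.

```python
def check_allowed(manifest, allowed_ids):
    """Reject a toy constructor manifest unless its id path is on the allowlist."""
    if manifest.get("type") != "constructor":
        raise ValueError("not a constructor manifest")
    class_path = tuple(manifest["id"])
    if class_path not in allowed_ids:
        dotted = ".".join(class_path)
        raise ValueError(f"deserialization of {dotted} is not allowed")
    return class_path


# Hypothetical class paths used only for this example
ALLOWED = {("my_pkg", "models", "SafeThing")}

accepted = check_allowed(
    {"lc": 1, "type": "constructor", "id": ["my_pkg", "models", "SafeThing"], "kwargs": {}},
    ALLOWED,
)

try:
    check_allowed(
        {"lc": 1, "type": "constructor", "id": ["my_pkg", "net", "FetchOnInit"], "kwargs": {}},
        ALLOWED,
    )
    rejected = False
except ValueError:
    rejected = True
```

The point of the design is that the decision is made from the manifest alone, before any class is imported or any __init__ runs.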
Migration guide
No changes needed for most users
If you're deserializing standard LangChain types (messages, documents, prompts, trusted partner integrations like ChatOpenAI, ChatAnthropic, etc.), your code will work without changes:
from langchain_core.load import load
# Uses default allowlist from serialization mappings
obj = load(serialized_data)
For custom classes
If you're deserializing custom classes not in the serialization mappings, add them to the allowlist:
from langchain_core.load import load
from my_package import MyCustomClass
# Specify the classes you need
obj = load(serialized_data, allowed_objects=[MyCustomClass])
For Jinja2 templates
Jinja2 templates are now blocked by default because they can execute arbitrary code. If you need Jinja2 templates, pass init_validator=None:
from langchain_core.load import load
from langchain_core.prompts import PromptTemplate
obj = load(
    serialized_data,
    allowed_objects=[PromptTemplate],
    init_validator=None
)
[!WARNING] Only disable init_validator if you trust the serialized data. Jinja2 templates can execute arbitrary Python code.
For secrets from environment
secrets_from_env now defaults to False. If you need to load secrets from environment variables:
from langchain_core.load import load
obj = load(serialized_data, secrets_from_env=True)
Credits
- Dumps bug was reported by @yardenporat
- Changes for security hardening due to findings from @0xn3va and @VladimirEliTokarev
Affected Packages
| Ecosystem | Package | Vulnerable range | Fix |
|---|---|---|---|
| 🐍PyPI | langchain-core | ≥ 1.0.0, < 1.2.5 | 1.2.5 |
| 🐍PyPI | langchain-core | < 0.3.81 | 0.3.81 |
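The table can be turned into a quick version check. The sketch below assumes the two rows mean "1.x releases before 1.2.5" and "earlier releases before 0.3.81", and it only handles plain X.Y.Z version strings; `parse` and `is_affected` are hypothetical helper names.

```python
def parse(version):
    """Parse a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version):
    """Return True if the version falls inside either vulnerable range."""
    parsed = parse(version)
    if parsed >= (1, 0, 0):
        return parsed < (1, 2, 5)   # 1.x line, fixed in 1.2.5
    return parsed < (0, 3, 81)      # pre-1.0 line, fixed in 0.3.81


results = {v: is_affected(v) for v in ["0.2.43", "0.3.80", "0.3.81", "1.0.0", "1.2.4", "1.2.5"]}
```

The installed version can be read with importlib.metadata.version("langchain-core") from the standard library. Pre-release strings such as release candidates are not handled here and would raise ValueError.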
Research use only. For defensive security, authorized penetration testing, and academic research only. Never execute exploit code against systems without explicit written authorization.
LangChain Core 1.2.4 - SSTI/RCE
by banyamer · Apr 29, 2026
Detection & mitigation playbook
Open-source dependency
Detect
Scan your dependency tree (package-lock.json, pnpm-lock.yaml, requirements.txt, go.sum, etc.) for langchain-core. O3's reachability analysis confirms whether the vulnerable code path is actually invoked in your application, so you act on real exposure instead of every transitive match.
Fix
Update langchain-core to 1.2.5 or later (or to 0.3.81 or later if you are on the 0.x line), then make sure no transitive (indirect) dependency still pins the vulnerable range. O3 confirms GHSA-c67j-w6g6-q2cm is resolved across your whole dependency graph.
Workarounds
If you can't upgrade right away: gate or disable the affected feature, validate untrusted input at the boundary, and avoid passing attacker-controlled data into the vulnerable path. O3's runtime protection blocks exploitation in production as an interim safeguard until the upgrade lands.
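One way to validate untrusted input at the boundary is to look for 'lc' key structures before the data reaches any serialization path. This is a minimal sketch of that idea, not a complete mitigation: `find_lc_markers` is a hypothetical helper, and the path syntax it reports is an arbitrary choice for readability.

```python
def find_lc_markers(value, path="$"):
    """Return the paths of every dict inside `value` that carries an 'lc' key."""
    hits = []
    if isinstance(value, dict):
        if "lc" in value:
            hits.append(path)
        for key, child in value.items():
            hits.extend(find_lc_markers(child, f"{path}.{key}"))
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            hits.extend(find_lc_markers(child, f"{path}[{index}]"))
    return hits


untrusted = {
    "additional_kwargs": {
        "tool": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]},
    },
    "tags": ["a", {"lc": 1}],
}

hits = find_lc_markers(untrusted)
clean = find_lc_markers({"metadata": {"source": "upload"}})
```

A caller could reject or log any input for which the returned list is non-empty. This only reduces exposure until the upgrade is applied, since legitimate user data may also happen to contain an 'lc' key.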
How O3 protects you
O3 pinpoints whether GHSA-c67j-w6g6-q2cm is reachable in your code and exactly where to fix it, then blocks exploitation in production at runtime until the patched version is deployed.
Tailored to GHSA-c67j-w6g6-q2cm. Runtime protection reduces exposure until a permanent patch is applied and verified — it complements patching, it doesn't replace it.
Frequently Asked Questions
Is GHSA-c67j-w6g6-q2cm in your dependencies?
O3 detects GHSA-c67j-w6g6-q2cm across PyPI dependencies and uses function-level reachability to confirm whether the vulnerable code path is actually reachable — not just present. No false positives.