Arbitrary file write (symlink/TOCTOU) plus log and webhook-header injection in Docker server

High
unclecode published GHSA-7cx2-g3h9-382p Jun 4, 2026

Package

crawl4ai (pip)

Affected versions

<= 0.8.7

Patched versions

0.8.8

Description

Summary

Three backward-compatible hardening fixes in the Docker API server. The headline issue is an arbitrary file write via the screenshot/PDF output_path.

1. Arbitrary file write via output_path symlink / TOCTOU (primary)

POST /screenshot and POST /pdf accept an output_path that validate_output_path is meant to constrain to ALLOWED_OUTPUT_DIR. Through 0.8.7 the check was string-only: it did not resolve symlinks, so a symlinked path component inside the output directory could redirect the write outside that directory, and the final open() followed symlinks as well. On a deployment where the runtime user can write to executable or cron locations, this turns an arbitrary file write into a code-execution primitive. The API is unauthenticated by default.

Fix: validate_output_path now resolves the real path of the parent directory (following symlinks) and re-checks containment, and the write goes through write_output_file, which opens the target with O_NOFOLLOW. output_path remains supported.
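
The two-part fix described above can be sketched as follows. This is a minimal illustration, not the patched Crawl4AI code: the function names come from the advisory, but the bodies, signatures, the ALLOWED_OUTPUT_DIR value, and the file mode are assumptions. O_NOFOLLOW is POSIX-only.

```python
import os

ALLOWED_OUTPUT_DIR = "/tmp/crawl4ai-outputs"  # assumed value, for illustration only


def validate_output_path(output_path: str, allowed_dir: str = ALLOWED_OUTPUT_DIR) -> str:
    """Return a path inside allowed_dir, or raise ValueError (hypothetical body)."""
    base = os.path.realpath(allowed_dir)
    # Normalise ".." segments; an absolute output_path replaces base here
    # and is then rejected by the containment check below.
    candidate = os.path.abspath(os.path.join(base, output_path))
    # Resolve symlinks in the parent directory, then re-check containment.
    parent = os.path.realpath(os.path.dirname(candidate))
    if os.path.commonpath([base, parent]) != base:
        raise ValueError("output_path escapes the allowed output directory")
    return os.path.join(parent, os.path.basename(candidate))


def write_output_file(path: str, data: bytes) -> None:
    """Write data, refusing to follow a symlink at the final path component."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_NOFOLLOW
    fd = os.open(path, flags, 0o600)  # raises OSError if path is a symlink
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
```

A string-only prefix check would accept a path such as link/x.png when link is a symlink pointing outside the directory. Resolving the parent first catches that case, and O_NOFOLLOW covers a symlink planted at the final component between validation and open. A parent directory swapped after validation is not covered by this sketch; opening relative to a directory file descriptor would narrow that window further.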

2. CRLF log injection (CWE-117)

User-controlled URLs and error messages reflected into log lines could embed CR/LF characters and forge additional log entries. Fix: a logging filter strips CR/LF/control characters from all records.
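
A filter of the kind described above might look like the sketch below. The class name and the choice to replace each control character with a space are assumptions, not the project's actual implementation.

```python
import logging
import re

# ASCII control characters, including CR, LF and tab, plus DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


class StripControlCharsFilter(logging.Filter):
    """Hypothetical filter that neutralises control characters in log records."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, so characters
        # injected through the arguments are cleaned as well.
        message = record.getMessage()
        record.msg = _CONTROL_CHARS.sub(" ", message)
        record.args = None
        return True  # keep the record, now sanitised
```

Attaching the filter to a handler (handler.addFilter(StripControlCharsFilter())) rather than to a single logger means it also applies to records propagated from child loggers.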

3. Webhook request-header injection (CWE-93/CWE-113)

User-supplied webhook headers were sent verbatim, allowing CRLF and hop-by-hop / sensitive header injection on the outbound webhook request. Fix: webhook headers are validated (name pattern, no control characters, deny Host/Content-Length/Transfer-Encoding/Authorization/Cookie/...), with early request-time rejection.
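
The validation rules listed above can be sketched as follows. The function name and error messages are hypothetical, and the deny set contains only the header names the advisory spells out; the real list is longer.

```python
import re
from typing import Dict, Mapping

# HTTP header names are "token" characters (RFC 9110); everything else is rejected.
_TOKEN = re.compile(r"[A-Za-z0-9!#$%&'*+.^_`|~-]+")
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")
# Only the names given in the advisory; the patched deny list has more entries.
_DENIED = {"host", "content-length", "transfer-encoding", "authorization", "cookie"}


def validate_webhook_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of headers, or raise ValueError on the first bad entry."""
    clean: Dict[str, str] = {}
    for name, value in headers.items():
        if not isinstance(name, str) or not isinstance(value, str):
            raise ValueError("header names and values must be strings")
        # fullmatch, not match with "$": "$" would accept a trailing newline.
        if not _TOKEN.fullmatch(name):
            raise ValueError(f"invalid header name: {name!r}")
        if name.lower() in _DENIED:
            raise ValueError(f"header not allowed on webhooks: {name!r}")
        if _CONTROL.search(value):
            raise ValueError(f"control character in value of {name!r}")
        clean[name] = value
    return clean
```

Running this check when the request is accepted, rather than when the webhook fires, matches the early request-time rejection the fix describes: the caller gets an immediate error instead of a silently dropped notification.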

Impact

Arbitrary file write (potential code execution) for #1; log forging for #2; request smuggling / header injection on outbound webhooks for #3.

Workarounds

  • Upgrade to the patched version.
  • Enable authentication (CRAWL4AI_API_TOKEN).
  • Run the container with a read-only root filesystem.

Credits

Internal security audit (Crawl4AI maintainers).

Severity

High

CVSS v3 base metrics

Attack vector: Network
Attack complexity: High
Privileges required: None
User interaction: None
Scope: Unchanged
Confidentiality: High
Integrity: High
Availability: High

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:H/I:H/A:H

CVE ID

No known CVE

Weaknesses

Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')

The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

Improper Link Resolution Before File Access ('Link Following')

The product attempts to access a file based on the filename, but it does not properly prevent that filename from identifying a link or shortcut that resolves to an unintended resource.

Improper Neutralization of CRLF Sequences ('CRLF Injection')

The product uses CRLF (carriage return line feeds) as a special element, e.g. to separate lines or records, but it does not neutralize or incorrectly neutralizes CRLF sequences from inputs.

Improper Output Neutralization for Logs

The product constructs a log message from external input, but it does not neutralize or incorrectly neutralizes special elements when the message is written to a log file.