Summary
Three backward-compatible hardening fixes in the Docker API server. The headline issue is an arbitrary file write via the screenshot/PDF output_path.
1. Arbitrary file write via output_path symlink / TOCTOU (primary)
POST /screenshot and POST /pdf accept an output_path that validate_output_path is meant to constrain to ALLOWED_OUTPUT_DIR. The 0.8.7 check was string-only: it did not resolve symlinks, so a symlinked path component inside the output directory could redirect the write outside the directory, and the final open() followed symlinks. On a deployment where the runtime user can write to executable or cron locations, this arbitrary write can be escalated to code execution. The API is unauthenticated by default.
Fix: validate_output_path now resolves the real path of the parent directory (following symlinks) and re-checks containment, and the write itself (write_output_file) opens the file with O_NOFOLLOW. output_path remains supported.
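The following is a minimal sketch of the two-step approach described above, not the actual patch. The function names match the advisory, but the signatures, the allowed_dir parameter, the file mode, and the error messages are assumptions made for illustration. It relies on O_NOFOLLOW, so it applies to POSIX systems only.

```python
import os


def validate_output_path(output_path: str, allowed_dir: str) -> str:
    """Return a path inside allowed_dir, or raise ValueError.

    Hypothetical signature: the real function may take different arguments.
    """
    base = os.path.realpath(allowed_dir)
    # Normalise ".." segments, then resolve symlinks in the parent directory.
    candidate = os.path.abspath(os.path.join(base, output_path))
    parent = os.path.realpath(os.path.dirname(candidate))
    # Re-check containment on the resolved parent, not on the raw string.
    if os.path.commonpath([base, parent]) != base:
        raise ValueError("output_path escapes the allowed output directory")
    return os.path.join(parent, os.path.basename(candidate))


def write_output_file(path: str, data: bytes) -> None:
    """Write data to path, refusing to follow a symlink at the final component."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_NOFOLLOW
    fd = os.open(path, flags, 0o600)  # raises OSError if path is a symlink
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

Note that O_NOFOLLOW only guards the last path component. The parent is covered by the realpath check, which still leaves a narrow window between validation and open if an attacker can modify the directory tree concurrently.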
2. CRLF log injection (CWE-117)
User-controlled URLs/errors reflected into log lines could embed CR/LF and forge additional log entries. Fix: a logging filter strips CR/LF/control characters from all records.
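A logging filter of this kind could look roughly like the sketch below. The class name ControlCharFilter and the choice to replace each control character with a space (rather than delete or escape it) are assumptions, not details taken from the patch.

```python
import logging
import re

# ASCII control characters, including CR and LF, plus DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


class ControlCharFilter(logging.Filter):
    """Hypothetical filter that neutralises control characters in log messages."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so characters smuggled in via args are covered.
        message = record.getMessage()
        record.msg = _CONTROL_CHARS.sub(" ", message)
        record.args = ()
        return True  # never drop the record, only sanitise it
```

Attaching the filter to each handler, for example with handler.addFilter(ControlCharFilter()), applies it to every record that handler emits. This sketch does not touch traceback text rendered from exc_info, which a complete fix would also need to consider.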
3. Webhook request-header injection (CWE-93/CWE-113)
User-supplied webhook headers were sent verbatim, allowing CRLF and hop-by-hop / sensitive header injection on the outbound webhook request. Fix: webhook headers are validated (name pattern, no control characters, deny Host/Content-Length/Transfer-Encoding/Authorization/Cookie/...), with early request-time rejection.
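As an illustration of the validation described above, a minimal sketch follows. The function name validate_webhook_headers, the exact name pattern (the RFC 9110 token grammar), and the error messages are assumptions. The deny list contains only the headers the advisory names explicitly; the advisory's list is elided, so the real one is longer.

```python
import re
from typing import Dict, Mapping

# Header names must be HTTP "token" characters (RFC 9110); assumed pattern.
_HEADER_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")

# Only the headers named in the advisory; the full list is not shown there.
DENIED_HEADERS = frozenset(
    {"host", "content-length", "transfer-encoding", "authorization", "cookie"}
)


def validate_webhook_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of headers, or raise ValueError on the first unsafe entry."""
    for name, value in headers.items():
        # fullmatch avoids the trailing-newline loophole of "$" anchors.
        if not _HEADER_NAME.fullmatch(name):
            raise ValueError(f"invalid webhook header name: {name!r}")
        if name.lower() in DENIED_HEADERS:
            raise ValueError(f"webhook header not allowed: {name!r}")
        if _CONTROL_CHARS.search(value):
            raise ValueError(f"control character in webhook header {name!r}")
    return dict(headers)
```

Running this check when the request is received, rather than when the webhook fires, gives the caller an immediate error instead of a silently failed delivery.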
Impact
Arbitrary file write (potential code execution) for #1; log forging for #2; request smuggling / header injection on outbound webhooks for #3.
Workarounds
- Upgrade to the patched version.
- Enable authentication (CRAWL4AI_API_TOKEN).
- Run the container with a read-only root filesystem.
Credits
Internal security audit (Crawl4AI maintainers).