Summary
Add a Dockerfile and docker-compose.yml so the pipeline runs identically on any machine without manual environment setup.
Motivation
The current setup requires Anaconda, a specific Python version, and system dependencies for WeasyPrint (Cairo, Pango). This is the most common friction point for new users. A Docker image bundles all of these, so no manual setup is needed.
Proposed Structure
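# Dockerfile
FROM python:3.11-slim
# WeasyPrint system deps (Cairo, Pango, GDK-PixBuf)
RUN apt-get update && apt-get install -y \
    libcairo2 libpango-1.0-0 libpangocairo-1.0-0 \
    libgdk-pixbuf2.0-0 libffi-dev shared-mime-info \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Copy requirements first so the dependency layer stays cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENTRYPOINT ["python", "main_v2.py"]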
# docker-compose.yml
services:
  research-engine:
    build: .
    environment:
      - SEC_USER_AGENT=${SEC_USER_AGENT}
    volumes:
      - ./outputs:/app/outputs
      - ./data:/app/data
    command: build-all --as-of 2026-03-27
Acceptance Criteria
- Dockerfile builds successfully
- docker-compose up runs a full build-all and writes output to ./outputs/
- SEC_USER_AGENT passed via environment variable (not hardcoded)
- .dockerignore added to exclude outputs/, data/cache/, .venv/
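
For the last criterion, a minimal .dockerignore might look like the sketch below. It lists only the three paths named above. The comment lines are illustrative, and the rationale they give is an assumption rather than something the project has confirmed.

```
# .dockerignore (sketch; covers only the paths named in the acceptance criteria)
# Generated reports; assumed unnecessary in the image because ./outputs is bind-mounted at runtime
outputs
# Cached data; kept out of the build context
data/cache
# Local virtual environment
.venv
```

Because `COPY . .` copies the entire build context, excluding these paths should keep the image smaller and avoid rebuilding that layer whenever outputs or cache files change.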