No isolated test environment for infrastructure changes #217

@ColinCee

Description

What's wrong

There is currently no way to validate infrastructure changes (Dockerfiles, compose configs, volume mounts, cross-service networking) before deploying. PR #216 needed 4 iterative fixes on the live server, all for runtime issues invisible to unit tests and config validation:

  1. .ssh dir owned by root (container runs as uid 1000)
  2. SSH ignores HOME env var, uses /etc/passwd for home directory resolution
  3. No git user identity configured in ephemeral container
  4. git reset --hard wiping files written before the reset

The current workaround — SSH into the server, checkout a branch, build, and test — is risky and not reproducible.

What done looks like

A mise run test:smoke command validates that changed stacks actually work before merging. Specifically:

  • scripts/smoke-test.sh <stack> builds, starts, health-checks, and tears down a stack in an isolated Docker Compose project (-p test-<stack>)
  • Test stacks join a shared test-homelab Docker network for cross-stack communication via DNS (replacing Tailscale IP references)
  • compose.test.yaml overrides per stack: no host port binds, dummy env vars, shared test network
  • A GHA workflow runs smoke tests on PRs touching stacks/{agents,observability,knowledge}/
  • ADR-018 documents the strategy and limitations

Focused on 3 stacks under active development: agents, observability, knowledge. Pre-built stacks (HA, MQTT, CrowdSec, flight-tracker) are out of scope.
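
A minimal sketch of how scripts/smoke-test.sh could tie these pieces together. It assumes each stack keeps a compose.yaml next to its compose.test.yaml under stacks/<stack>/ (the base file name is a guess). The run and compose helpers, the DRY_RUN switch, and the 120-second timeout are hypothetical, added so the generated command line can be inspected without touching Docker.

```shell
#!/bin/sh
# Sketch only: file layout, helper names, DRY_RUN, and the timeout are assumptions.
set -eu

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run docker compose against one stack in its isolated test project.
compose() {
  stack="$1"
  shift
  run docker compose -p "test-$stack" \
    -f "stacks/$stack/compose.yaml" \
    -f "stacks/$stack/compose.test.yaml" "$@"
}

# Build, start, wait for health checks, then tear down.
smoke_test() {
  stack="$1"
  # Teardown runs on every exit path, including failures under set -e.
  trap 'compose "$stack" down --volumes --remove-orphans' EXIT
  # Create the shared network if it does not exist yet.
  docker network inspect test-homelab >/dev/null 2>&1 ||
    run docker network create test-homelab
  # --wait blocks until services are running or healthy, or the timeout expires.
  compose "$stack" up --build --wait --wait-timeout 120
}

# Only act when a stack name is given, e.g. ./smoke-test.sh agents
if [ "$#" -gt 0 ]; then
  smoke_test "$1"
fi
```

Relying on up --wait only proves something if the services define healthchecks; a service without one counts as ready as soon as it is running.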

What the agent can't discover

Cross-stack dependencies (from analysis):

  • Alloy config has exactly 2 Tailscale IP references that need overriding for test mode:
    • 100.100.146.119:6060 → CrowdSec metrics scrape target
    • 100.100.146.119:8585 → Agent service metrics scrape target
  • These need a config.test.alloy that uses Docker DNS names on the shared test-homelab network
  • All other cross-stack communication is internal to each compose project (Grafana→Prometheus, knowledge ingest→postgres)
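
One way to derive config.test.alloy from the production config, or to check a hand-written one, is a pair of small filters like the sketch below. The DNS names crowdsec and agents are hypothetical service names on the test-homelab network; the real names depend on how the compose services are declared.

```shell
#!/bin/sh
# Sketch only: "crowdsec" and "agents" are hypothetical DNS names on the
# test-homelab network, not taken from the real compose files.
set -eu

# Read an Alloy config on stdin, write the test variant to stdout.
to_test_alloy() {
  sed \
    -e 's/100\.100\.146\.119:6060/crowdsec:6060/g' \
    -e 's/100\.100\.146\.119:8585/agents:8585/g'
}

# Succeed only if no reference to the Tailscale IP is left in the config on stdin.
no_tailscale_refs() {
  ! grep -q '100\.100\.146\.119'
}
```

Piping the production config through to_test_alloy and then through no_tailscale_refs would catch a third Tailscale reference if one is ever added to the Alloy config.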

Docker Compose project isolation:

  • -p test-<stack> automatically prefixes volume names (e.g., test-agents_repo-cache), preventing data conflicts
  • Port conflicts are avoided because the test overrides remove every host port bind; -p alone does not change published ports
  • Container names are prefixed too, so no naming collisions (unless a service sets an explicit container_name)
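
The no-host-ports guarantee can be checked rather than assumed. Compose appends ports entries when merging files, so compose.test.yaml likely needs the !reset tag on ports instead of an empty list. The sketch below is a conservative guard over the merged output of docker compose config: it flags any ports section at all, assuming a service without port mappings renders no ports key. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes a service with no port mappings renders without a
# `ports:` key in the output of `docker compose config`.
set -eu

# Succeed if the rendered compose config on stdin declares no ports section.
no_host_ports() {
  ! grep -Eq '^[[:space:]]*ports:'
}
```

A smoke-test script could pipe the merged config of each test project into no_host_ports and abort before starting anything if the check fails.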

Constraints:

  • Agent worker spawning shares the Docker daemon with production — smoke tests should verify the API starts and responds to health checks, not test full worker lifecycle
  • The GHA workflow must be human-authored (agents cannot modify .github/workflows/)
  • Server has ~12GB free RAM; production containers use ~2GB total. A full parallel test stack is well within budget.
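
For the "API starts and responds to health checks" constraint, a bounded retry loop is usually enough. A minimal sketch, with a hypothetical retry helper; the attempt count and delay are placeholders to tune.

```shell
#!/bin/sh
# Sketch only: helper name, attempt count, and delay are assumptions.
set -eu

# retry <attempts> <delay-seconds> <command...>
# Re-run the command until it succeeds or the attempts are exhausted.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

Because test stacks bind no host ports, the probe passed to retry has to run inside the test project, for example through docker compose -p test-agents exec -T against the API service, rather than from the host. The service name, health path, and probe tool depend on the image and are not specified here.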

What must not break

  • Production stacks must not be affected (no shared ports, no shared volumes, no shared container names)
  • mise run ci stays fast (~60s) — smoke tests are a separate workflow/task
  • Deploy workflow is unchanged
  • Teardown must be bulletproof (a trap in the script, an if: always() step in GHA), since leftover test containers waste resources

Deliverables

  1. docs/decisions/018-isolated-test-environments.md — ADR
  2. stacks/agents/compose.test.yaml — test overrides
  3. stacks/observability/compose.test.yaml + config.test.alloy — test overrides with DNS-based scrape targets
  4. stacks/knowledge/compose.test.yaml — test overrides
  5. scripts/smoke-test.sh — build/start/health-check/teardown orchestrator
  6. mise.toml - test:smoke and test:smoke:<stack> tasks
  7. .github/workflows/smoke-test.yaml — PR-triggered smoke tests (human-authored, see below)

Out of scope (human follow-up)

  • .github/workflows/smoke-test.yaml — agents cannot modify workflow files. Write this after the smoke-test script is validated.
  • Agent self-validation (future: agents run scripts/smoke-test.sh from their worktree before creating PRs)
  • Home Assistant, MQTT, CrowdSec, flight-tracker smoke tests (pre-built images, rarely break)
  • Cloudflare tunnel testing (requires real token)
  • Full end-to-end tests (SSH push to GitHub, Copilot CLI calls)

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
