An AI-powered X content strategy coach, built by a dark software factory.
The product: An AI-curated X feed surfaces relevant content and delivers it to Slack via an incoming webhook. A producer agent analyzes engagement data, designs content experiments, and recommends posting and reply strategies for tech/AI thought leadership. The human posts manually; the agent coaches but never has write access to X. This is a proof of concept.
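As a rough illustration of the Slack delivery step, the sketch below formats a few curated items and builds the HTTP request for an incoming webhook. Slack incoming webhooks accept a JSON body with a `text` field. Everything else here is an assumption for illustration: the placeholder `WEBHOOK_URL`, the item fields `author` and `url`, and the function names are hypothetical and not taken from this project.

```python
import json
import urllib.request

# Hypothetical placeholder; a real incoming-webhook URL is issued by Slack.
WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE/EXAMPLE/EXAMPLE"


def build_payload(items):
    """Format curated items (assumed dicts with 'author' and 'url') as one message."""
    lines = [f"- {item['author']}: {item['url']}" for item in items]
    return {"text": "Curated X feed\n" + "\n".join(lines)}


def build_request(payload, url=WEBHOOK_URL):
    """Build the POST request carrying the JSON payload."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def deliver(items, url=WEBHOOK_URL):
    """Send the digest. Not called here because it needs a real webhook URL."""
    with urllib.request.urlopen(build_request(build_payload(items), url)) as resp:
        return resp.status
```

Keeping payload construction separate from the network call lets the formatting be checked without contacting Slack.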
The factory: An implementation of the dark factory concept — a Level 5 autonomous software development system where humans write specifications and evaluate outcomes, but no human writes or reviews code. The system uses Attractor for pipeline orchestration, holdout scenarios for validation, and LLM-as-judge satisfaction scoring for quality gating.
| Directory | What it contains |
|---|---|
| .specify/ | Project specification framework: constitution, project memory, and templates |
| attractor/ | NLSpecs from strongdm/attractor: the pipeline engine (DOT-graph orchestration), coding agent loop, and unified LLM client specifications |
| darkfactory/ | PRD, development roadmap, and operational templates covering validation, digital twin universe, scenario suites, satisfaction scoring, and factory operations |
| .github/ | GitHub Actions CI workflows |
| .claude/ | Claude Code configuration: skill definitions |
The core loop:
Spec Ingestion → Planning → Implementation → Self-Testing →
Holdout Validation → Satisfaction Scoring → [Pass: Ship | Fail: Feedback Loop]
Attractor defines this loop as a directed graph in Graphviz DOT syntax. Each node is an AI task backed by a pluggable handler. The engine traverses the graph, executing handlers, checkpointing state, and routing based on edge conditions. Human gates can pause execution for approval at any point.
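The traversal described above can be sketched in a few lines of Python. This is not Attractor's API or its DOT parser: the graph is written as plain dictionaries, and `run_graph`, the handler functions, and the 0.8 routing threshold are all hypothetical names and values chosen for illustration. Human gates are omitted.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Handler = Callable[[State], State]
Edge = Tuple[str, Callable[[State], bool]]  # (target node, edge condition)


def run_graph(nodes: Dict[str, Handler], edges: Dict[str, List[Edge]],
              start: str, state: State, max_steps: int = 20):
    """Run handlers node by node, checkpointing state and following the
    first outgoing edge whose condition holds."""
    checkpoints = []
    current = start
    for _ in range(max_steps):
        state = nodes[current](dict(state))          # execute the handler
        checkpoints.append((current, dict(state)))   # checkpoint a copy
        next_node = None
        for target, condition in edges.get(current, []):
            if condition(state):
                next_node = target
                break
        if next_node is None:                        # no edge matched: done
            return state, checkpoints
        current = next_node
    raise RuntimeError("max_steps exceeded")


def plan(state):
    state["plan"] = "draft"
    return state


def implement(state):
    state["attempts"] = state.get("attempts", 0) + 1
    return state


def validate(state):
    # Stand-in score that improves with each attempt (illustrative only).
    state["score"] = 0.5 + 0.2 * state["attempts"]
    return state


def ship(state):
    state["shipped"] = True
    return state


NODES = {"plan": plan, "implement": implement, "validate": validate, "ship": ship}
EDGES = {
    "plan": [("implement", lambda s: True)],
    "implement": [("validate", lambda s: True)],
    # Pass routes to ship; otherwise fall through to the feedback edge.
    "validate": [("ship", lambda s: s["score"] >= 0.8),
                 ("implement", lambda s: True)],
}

final_state, checkpoints = run_graph(NODES, EDGES, "plan", {})
visited = [name for name, _ in checkpoints]
```

In this toy run the first validation falls below the threshold, so the engine routes back to `implement` once before reaching `ship`.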
- NLSpec — Natural language specification intended to be directly usable by coding agents. The spec is the control plane.
- Holdout Scenarios — Behavioral validation that agents cannot see during development. Early phases use same-repo storage with hook-enforced access controls; later phases move to a physically separate repository. This is the ML holdout set applied to software.
- Satisfaction Scoring — LLM-as-judge evaluates scenario trajectories probabilistically (0.0-1.0 scale), not boolean pass/fail. Uses a different model than the coding agent.
- Digital Twin Universe (DTU) — Docker-based behavioral clones of external services for integration testing at scale without rate limits or API costs.
- Convergence Loop — Generate → validate → identify failures → regenerate, iterating until satisfaction threshold is met.
See darkfactory/docs/ for detailed design documents on each concept.
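To make the interaction between satisfaction scoring and the convergence loop concrete, here is a minimal sketch under stated assumptions. The `generate` and `judge` callables stand in for the coding agent and the LLM judge, the 0.8 threshold and the mean aggregation are illustrative choices rather than values from the design documents, and how much failure detail may be fed back without leaking holdout scenarios is left open.

```python
from statistics import mean


def converge(generate, judge, scenarios, threshold=0.8, max_iterations=5):
    """Generate, validate, identify failures, regenerate, until the mean
    satisfaction score reaches the threshold or iterations run out."""
    feedback = []
    overall = 0.0
    for iteration in range(1, max_iterations + 1):
        candidate = generate(feedback)
        # Each scenario gets a score in [0.0, 1.0] rather than pass/fail.
        scores = {name: judge(candidate, name) for name in scenarios}
        overall = mean(list(scores.values()))
        if overall >= threshold:
            return candidate, overall, iteration
        # Identify failing scenarios to inform the next attempt.
        feedback = [name for name, score in scores.items() if score < threshold]
    return None, overall, max_iterations


def make_stub_generator():
    """Hypothetical generator: each call returns the next revision number."""
    revision = {"n": 0}

    def generate(feedback):
        revision["n"] += 1
        return revision["n"]

    return generate


def stub_judge(candidate, scenario):
    """Hypothetical judge: later revisions score higher, capped at 1.0."""
    return min(1.0, 0.5 + 0.2 * candidate)


candidate, overall, iterations = converge(
    make_stub_generator(), stub_judge, ["onboarding", "weekly_digest"]
)
```

With these stubs the first revision scores below the threshold and the second clears it, so the loop stops after two iterations.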
- Python 3.x
- Claude Code CLI (`claude -p` for headless execution, hooks/skills/subagents for orchestration)
- claude-agent-sdk
- Attractor (DOT-graph pipeline orchestration)
- pytest
Specification phase. Product constitution ratified — four principles governing the coaching agent's voice, experimentation model, feedback loop, and scope discipline. Factory infrastructure and implementation specs are forthcoming. This is a proof of concept.
| Source | What it covers |
|---|---|
| StrongDM Software Factory | The original dark factory — story, principles, techniques |
| strongdm/attractor | Open-source NLSpecs for pipeline orchestration |
| The Five Levels | Dan Shapiro's maturity framework — Spicy Autocomplete to Dark Factory |
| METR Developer Study | Why bolting AI onto existing workflows makes developers slower |