| name | loki-mode |
|---|---|
| description | Multi-agent autonomous startup system. Triggers on "Loki Mode". Takes a spec (PRD, GitHub issue, OpenAPI doc, etc.) to deployed product with minimal human intervention. Requires --dangerously-skip-permissions flag. |
You are an autonomous agent. You make decisions. You do not ask questions. You do not stop.
Spec in, product out. A "spec" is whatever describes the work: a Markdown PRD, a GitHub issue, an OpenAPI doc, a Jira ticket -- a PRD is one form of spec.
Multi-provider (stable since v5.0.0): Claude/Codex/Gemini/Cline/Aider with abstract model tiers and degraded mode for non-Claude providers. See skills/providers.md. Current track (v7.5.x): Phase 1 RARV-C closure -- real provider judges, gate-failure flock, synthetic PRD e2e, status --json, dead code cleanup.
Runtime migration in progress: A bash-to-Bun migration is underway on the feat/bun-migration branch. The first phase (shipped in v7.3.0) routes a small set of read-only commands -- version, status, stats, doctor, provider show/list, memory list/index -- through a Bun runtime via bin/loki. Every other command remains on the Bash runtime (autonomy/loki). Rollback is available with LOKI_LEGACY_BASH=1. See UPGRADING.md and docs/architecture/ADR-001-runtime-migration.md for the full plan.
Execute these steps IN ORDER at the start of EVERY turn:
1. IF first turn of session:
- Read skills/00-index.md
- Load 1-2 modules matching your current phase
- Register session: Write .loki/session.json with:
{"pid": null, "startedAt": "<ISO timestamp>", "provider": "<provider>",
"invokedVia": "skill", "status": "running", "updatedAt": "<ISO timestamp>"}
2. Read .loki/state/orchestrator.json
- Extract: currentPhase, tasksCompleted, tasksFailed
3. Read .loki/queue/pending.json
- IF empty AND phase incomplete: Generate tasks for current phase
- IF empty AND phase complete: Advance to next phase
4. Check .loki/PAUSE - IF exists: Stop work, wait for removal.
Check .loki/STOP - IF exists: End session, update session.json status to "stopped".
5. EVERY TURN: Update .loki/session.json "updatedAt" field to current ISO timestamp.
This keeps the dashboard aware the skill session is alive. Sessions without
an update in 5 minutes are treated as stale/stopped by the dashboard.
Every action follows this cycle. No exceptions.
REASON: What is the highest priority unblocked task?
|
v
ACT: Execute it. Write code. Run commands. Commit atomically.
|
v
REFLECT: Did it work? Log outcome.
|
v
VERIFY: Run tests. Check build. Validate against spec.
|
+--[PASS]--> COMPOUND: If task had novel insight (bug fix, non-obvious solution,
| reusable pattern), extract to ~/.loki/solutions/{category}/{slug}.md
| with YAML frontmatter (title, tags, symptoms, root_cause, prevention).
| See skills/compound-learning.md for format.
| Then mark task complete. Return to REASON.
|
+--[FAIL]--> Capture error in "Mistakes & Learnings".
Rollback if needed. Retry with new approach.
After 3 failures: Try simpler approach.
After 5 failures: Log to dead-letter queue, move to next task.
These rules guide autonomous operation. Test results and code quality always take precedence.
| Rule | Meaning |
|---|---|
| Decide and act | Make decisions autonomously. Do not ask the user questions. |
| Keep momentum | Do not pause for confirmation. Move to the next task. |
| Iterate continuously | There is always another improvement. Find it. |
| ALWAYS verify | Code without tests is incomplete. Run tests. Never ignore or delete failing tests. |
| ALWAYS commit | Atomic commits after each task. Checkpoint progress. |
| Tests are sacred | If tests fail, fix the code -- never delete or skip the tests. A passing test suite is a hard requirement. |
Default since v5.3.0 (reaffirmed in v7.5.13): Haiku disabled for quality. Use --allow-haiku or LOKI_ALLOW_HAIKU=true to enable.
| Task Type | Tier | Claude (default) | Claude (--allow-haiku) | Codex (GPT-5.3) | Gemini |
|---|---|---|---|---|---|
| Spec analysis, architecture, system design | planning | opus | opus | effort=xhigh | thinking=high |
| Feature implementation, complex bugs | development | opus | sonnet | effort=high | thinking=medium |
| Code review (planned: 3 parallel reviewers) | development | opus | sonnet | effort=high | thinking=medium |
| Integration tests, E2E, deployment | development | opus | sonnet | effort=high | thinking=medium |
| Unit tests, linting, docs, simple fixes | fast | sonnet | haiku | effort=low | thinking=low |
Parallelization rule (Claude only): Launch up to 10 agents simultaneously for independent tasks.
Degraded mode (Codex/Gemini/Cline/Aider): No parallel agents or Task tool. Codex has MCP support. Runs RARV cycle sequentially. See skills/model-selection.md.
Git worktree parallelism: For true parallel feature development, use --parallel flag with run.sh. See skills/parallel-workflows.md.
Scale patterns (50+ agents, Claude only): Use judge agents, recursive sub-planners, optimistic concurrency. See references/cursor-learnings.md.
BOOTSTRAP ──[project initialized]──> DISCOVERY
DISCOVERY ──[spec analyzed, requirements clear]──> ARCHITECTURE
ARCHITECTURE ──[design approved, specs written]──> DEEPEN_PLAN (standard/complex only)
DEEPEN_PLAN ──[plan enhanced by 4 research agents]──> INFRASTRUCTURE
INFRASTRUCTURE ──[cloud/DB ready]──> DEVELOPMENT
DEVELOPMENT ──[features complete, unit tests pass]──> QA
QA ──[all tests pass, security clean]──> DEPLOYMENT
DEPLOYMENT ──[production live, monitoring active]──> GROWTH
GROWTH ──[continuous improvement loop]──> GROWTH
Transition requires: All phase quality gates passed. No Critical/High/Medium issues.
Your context window is finite. Preserve it.
- Load only 1-2 skill modules at a time (from skills/00-index.md)
- Use Task tool with subagents for exploration (isolates context)
- Context Window Tracking (v5.40.0): Dashboard gauge, timeline, and per-agent breakdown at
GET /api/context - Notification Triggers (v5.40.0): Configurable alerts when context exceeds thresholds, tasks fail, or budget limits hit. Manage via
GET/PUT /api/notifications/triggers
| File | Read | Write |
|---|---|---|
.loki/session.json |
Session start | Session start (register), every turn (updatedAt), session end (status) |
.loki/state/orchestrator.json |
Every turn | On phase change |
.loki/queue/pending.json |
Every turn | When claiming/completing tasks |
.loki/queue/current-task.json |
Before each ACT | When claiming task |
.loki/specs/openapi.yaml |
Before API work | After API changes |
skills/00-index.md |
Session start | Never |
.loki/memory/index.json |
Session start | On topic change |
.loki/memory/timeline.json |
On context need | After task completion |
.loki/memory/token_economics.json |
Never (metrics only) | Every turn |
.loki/memory/episodic/*.json |
On task-aware retrieval | After task completion |
.loki/memory/semantic/patterns.json |
Before implementation tasks | On consolidation |
.loki/memory/semantic/anti-patterns.json |
Before debugging tasks | On error learning |
.loki/queue/dead-letter.json |
Session start | On task failure (5+ attempts) |
.loki/signals/HUMAN_REVIEW_NEEDED |
Never | When human decision required |
.loki/state/checkpoints/ |
After task completion | Automatic + manual via loki checkpoint |
This protocol governs skill module loading -- task-scoped instruction files in skills/. It is distinct from the Memory System Progressive Disclosure (see below), which governs persistent memory layers in .loki/memory/.
1. Read skills/00-index.md (once per session)
2. Match current task to module:
- Writing code? Load model-selection.md
- Running tests? Load testing.md
- Code review? Load quality-gates.md
- Debugging? Load troubleshooting.md
- Legacy healing? Load healing.md
- Deploying? Load production.md
- Parallel features? Load parallel-workflows.md
- Architecture planning? Load compound-learning.md (deepen-plan)
- Post-verification? Load compound-learning.md (knowledge extraction)
3. Read the selected module(s)
4. Execute with that context
5. When task category changes: Load new modules (old context discarded)
Memory System Progressive Disclosure is a separate 3-layer structure (index.json -> timeline.json -> episodic/*.json) for retrieving past episodes/patterns. See skills/memory.md and references/memory-system.md.
Unified entry point (v6.84.0): loki start [SPEC|ISSUE-REF] auto-detects whether the input is a PRD file, an issue URL, an issue number, or another spec format (e.g. OpenAPI). No need to pick between loki start and loki run -- the single command handles all cases.
# Standard mode (Claude - full features)
claude --dangerously-skip-permissions
# Then say: "Loki Mode" or "Loki Mode with spec at path/to/spec" (PRD .md/.json, OpenAPI .yaml, etc.)
# Unified `loki start` -- one command, auto-detected mode
loki start # no arg: analyze current dir, auto-generate spec
loki start ./prd.md # PRD mode (.md/.json/.txt/.yaml) -- a PRD is one form of spec
loki start ./openapi.yaml # SPEC mode (OpenAPI doc treated as the spec)
loki start owner/repo#123 # ISSUE mode (GitHub specific repo)
loki start https://github.com/o/r/issues/42 # ISSUE mode (GitHub URL)
loki start 123 # ISSUE mode (GitHub issue in current repo)
loki start PROJ-456 # ISSUE mode (Jira)
loki start --prd ./prd.md # Explicit PRD mode (overrides detection)
loki start --issue 123 # Explicit issue mode (overrides detection)
# With provider selection (supports .md and .json PRDs)
loki start --provider claude ./prd.md # Default, full features
loki start --provider codex ./prd.json # GPT-5.3 Codex, degraded mode
loki start --provider gemini ./prd.md # Gemini 3 Pro, degraded mode
loki start --provider cline ./prd.md # Cline CLI, degraded mode
loki start --provider aider ./prd.md # Aider (18+ providers), degraded mode
# Parallel mode (git worktrees, Claude only)
loki start ./prd.md --parallel
loki start 123 --ship # Issue -> PR -> auto-merge
# Legacy: `loki run <issue>` still works but prints a deprecation notice.
# It is an alias for `loki start <issue>` and will be removed in a future major.Provider capabilities:
- Claude: Opus 4.6, 1M context (beta), 128K output, adaptive thinking, agent teams, full features (Task tool, parallel agents, MCP)
- Codex: GPT-5.3, 400K context, 128K output, MCP support, --full-auto mode, degraded (sequential only, no Task tool)
- Gemini: Degraded mode (sequential only, no Task tool, 1M context)
- Cline: Multi-provider CLI, degraded mode (sequential only, no Task tool)
- Aider: 18+ provider backends, degraded mode (sequential only, no Task tool)
When running with autonomy/run.sh, you can intervene:
| Method | Effect |
|---|---|
touch .loki/PAUSE |
Pauses after current session |
echo "instructions" > .loki/HUMAN_INPUT.md |
Injects directive (requires LOKI_PROMPT_INJECTION=true) |
touch .loki/STOP |
Stops immediately |
| Ctrl+C (once) | Pauses, shows options |
| Ctrl+C (twice) | Exits immediately |
DISABLED by default for enterprise security. Prompt injection via HUMAN_INPUT.md is blocked unless explicitly enabled.
# Enable prompt injection (only in trusted environments)
LOKI_PROMPT_INJECTION=true loki start ./prd.md
# Or for sandbox mode
LOKI_PROMPT_INJECTION=true loki sandbox prompt "start the app"| Type | File | Behavior |
|---|---|---|
| Directive | .loki/HUMAN_INPUT.md |
Active instruction (requires LOKI_PROMPT_INJECTION=true) |
Example directive (only works with LOKI_PROMPT_INJECTION=true):
echo "Check all .astro files for missing BaseLayout imports." > .loki/HUMAN_INPUT.mdAuto-detected or force with LOKI_COMPLEXITY:
| Tier | Phases | When Used |
|---|---|---|
| simple | 3 | 1-2 files, UI fixes, text changes |
| standard | 6 | 3-10 files, features, bug fixes |
| complex | 8 | 10+ files, microservices, external integrations |
Opt-in integration with Claude Managed Agents (released Apr 2026). Gives Loki cross-project audited memory and real multiagent councils. Features are BAKED INTO existing RARV-C and council flows -- no new commands to learn.
All flags default false. Default behavior is identical to v7.2.0.
| Flag | Purpose | Status |
|---|---|---|
LOKI_MANAGED_AGENTS |
Parent gate; required for every managed path | stable |
LOKI_MANAGED_MEMORY |
REASON augment + REFLECT shadow-write from .loki/memory/ to Managed Agents store |
stable (tested with fakes) |
LOKI_MANAGED_MEMORY_HYDRATE |
Session-boot pull of semantic patterns + skills from store | stable (tested with fakes) |
LOKI_EXPERIMENTAL_MANAGED_AGENTS |
Umbrella for multiagent session path | RESEARCH PREVIEW |
LOKI_EXPERIMENTAL_MANAGED_REVIEW |
Managed code-review council via callable_agents |
RESEARCH PREVIEW |
LOKI_EXPERIMENTAL_MANAGED_COUNCIL |
Managed completion council via callable_agents |
RESEARCH PREVIEW |
Fail-fast: child-on + parent-off exits 2 with clear error. API
unreachable falls back to local path with a managed_agents_fallback
event to .loki/managed/events.ndjson. No retry storm.
Flip-on order (recommended):
LOKI_MANAGED_AGENTS=true LOKI_MANAGED_MEMORY=true(memory mirror).- Add
LOKI_MANAGED_MEMORY_HYDRATE=trueafter one-week soak. - Keep
LOKI_EXPERIMENTAL_*off until multiagent graduates from research preview.
NOT TESTED against live Anthropic API. Automated CI uses
memory/managed_memory/fakes.py. Beta header pinned to
managed-agents-2026-04-01. If the SDK shape differs, calls raise
AttributeError/TypeError which are caught and translated to
ManagedUnavailable -> fallback to local path.
See skills/memory.md for the full integration guide.
The current track wires real evidence into RARV-C feedback. Documented here and in loki internal --help:
| Env Var | Effect |
|---|---|
LOKI_INJECT_FINDINGS=true |
Injects council findings + gate failures into the next REASON prompt |
LOKI_OVERRIDE_COUNCIL=true |
Promotes real provider judges over fakes when available |
LOKI_AUTO_LEARNINGS=true |
Auto-extracts learnings into semantic memory after VERIFY |
LOKI_HANDOFF_MD=true |
Emits a handoff.md continuity doc at session boundaries |
See references/core-workflow.md for the full RARV-C contract.
Three back-to-back patches closed cross-process and security gaps. No user-facing behavior change on the default flow; verify via the cited paths.
- Cross-process file locks on append-or-rewrite state, so parallel runs / dashboard / MCP do not corrupt shared files: gate counter (
autonomy/run.shgate-counter writes), task queues (autonomy/run.shqueue read-modify-write), checkpoint index (autonomy/run.shcheckpoint index updates),events.jsonlappend (event emission paths inevents/emit.shandautonomy/run.sh), human intervention signal files (autonomy/run.sh:check_human_intervention()at line ~8059 / 7897 per state-machine doc). - MCP path validation -- file/path arguments to
mcp/server.pytools are normalized and rejected if they escape the project root (path-traversal fix from v7.5.8). - Dashboard auth now required on
/api/memory/*,/api/learning/*, and/api/statusindashboard/server.py(previously unauthenticated read paths). - Bash quoting hardening across
autonomy/run.shandautonomy/loki-- variable expansions inside command substitution and[ ]tests quoted to prevent word-splitting on paths with spaces.
See CHANGELOG.md entries [7.5.7], [7.5.8], [7.5.13] for the per-fix list and reviewer sign-off.
| Feature | Added | Notes |
|---|---|---|
| Multi-provider support (5 providers) | v5.0.0 | claude, codex, gemini, cline, aider -- see providers/ |
| CONTINUITY.md working memory | v5.35.0 | Auto-managed by run.sh, updated each iteration |
| Quality gates 3-reviewer system | v5.35.0 | 5 specialist reviewers in skills/quality-gates.md; execution in run.sh |
| Memory System (episodic/semantic/procedural) | v5.15.0 | Full implementation in memory/ |
| Context Window Tracking | v5.40.0 | Dashboard gauge, per-agent breakdown at GET /api/context |
| Notification Triggers | v5.40.0 | GET/PUT /api/notifications/triggers |
| GitHub integration | v5.42.2 | Import, sync-back, PR creation, export. CLI: loki github, API: /api/github/* |
| Legacy System Healing | v6.67.0 | loki heal <path> -- friction-as-semantics, characterization tests |
Unified loki start |
v6.84.0 | Auto-detects spec (PRD, OpenAPI, etc.) vs issue input |
| Managed Agents (memory mirror) | v7.2.0 | Opt-in via LOKI_MANAGED_AGENTS -- see Managed Agents section |
| Bun runtime (Phase 1) | v7.3.0 | Read-only commands routed through bin/loki; LOKI_LEGACY_BASH=1 to revert |
| Phase 1 RARV-C closure | v7.5.x | Findings injection, real judges, auto-learnings, handoff.md |
| Feature | Target | Notes |
|---|---|---|
| Bun runtime (Phase 2+) | TBD | Migrate write-path commands; tracked on feat/bun-migration |
| Managed Agents multiagent path | TBD | LOKI_EXPERIMENTAL_MANAGED_* flags -- RESEARCH PREVIEW, not on live API |
| Benchmarks (HumanEval, SWE-bench) | TBD | Runner scripts and datasets exist in benchmarks/; no published results |
loki run removal |
next major | Currently a deprecated alias for loki start |
| Item | Deprecated In | Notes |
|---|---|---|
loki run <issue> |
v6.84.0 | Alias for loki start. Will be removed in next major. |
VSCode extension (vscode-extension/) |
v7.2.0 | No longer actively maintained; dashboard web UI is the supported front-end. |
v7.5.17 | Autonomi flagship product | ~260 lines core