An @enchanter-ai product — algorithm-driven, agent-managed, self-learning.
The context health platform that learns what wastes your tokens — and stops it.
3 plugins. 9 algorithms. 4 agents. Honest numbers.
40 minutes into a session, Emu told me Claude had been editing and reverting the same file for 12 minutes. I didn't notice. It did.
In plain English: Some sessions burn 200k tokens to ship twelve lines. Emu tells you while you can still steer out — not after the context dies and you start over.
Technically: A1 Markov Drift Detection classifies turn sequences into READ_LOOP / EDIT_REVERT / TEST_FAIL_LOOP patterns against a 5-turn cooldown window; A2 Linear Runway Forecasting estimates remaining turns to compaction with a ±CI band. Context savings are recorded per strategy in metrics.jsonl and accumulated cross-session via A7 Exponential Strategy Averaging — every advisory cites observed pattern, not inferred intent.
Emu takes its name from Alex's Mobs — a flightless bird that stands tall on the open plain and uses its long-range vision to spot threats on the horizon long before they arrive. Emu watches your session the same way: eyes on the token flow, flagging the runway edge before you hit it.
The question this plugin answers: What did I spend?
- Developers who've watched a session burn through context on re-reads, revert loops, or verbose tool output — and want an objective readout instead of a gut feeling.
- Teams that need honest numbers (runway with a ±CI, drift with a specific pattern) for retrospectives, not a marketing dashboard.
- Privacy-conscious users — everything Emu observes stays local; there is no outbound network code (see PRIVACY.md).
Not for:
- Centralized-observability teams who need a cloud dashboard — Emu is machine-local by design.
- Sessions where context burn is clearly a smaller problem than code correctness: reach for Crow or Lich first, then Emu.
- How It Works
- What Makes Emu Different
- The Full Lifecycle
- Install
- Quickstart
- 3 Plugins, 4 Agents, 9 Algorithms
- What You Get Per Session
- Roadmap
- The Science Behind Emu
- Commands
- Compression Rules (15)
- vs Everything Else
- Agent Conduct (11 Modules)
- Architecture
- Acknowledgments
- Versioning & release cadence
- Contributing
- Citation
- License
Emu splits into three plugins that each own one lifecycle phase. token-saver fires on PreToolUse to compress verbose Bash output (A3), block duplicate file reads (A5), and return deltas on changed re-reads (A6). context-guard fires on PostToolUse to forecast runway (A2) and detect drift patterns (A1). state-keeper fires on PreCompact to write an atomic checkpoint (A4). Across sessions, A7 accumulates per-strategy success rates. The diagram below shows this flow.
Source: docs/assets/pipeline.mmd · Regeneration command in docs/assets/README.md.
Three plugins. Three lifecycle phases. No overlap. No dependencies between plugins.
Catches Claude spinning in circles — in real time, not after the fact:
⚠️ Drift Alert: src/auth.ts read 4× without changes.
Claude may be stuck re-reading without progress.
→ Reframe the problem or /emu:checkpoint before /compact.
Three patterns: read loops, edit-revert cycles, test fail loops. 5-turn cooldown between alerts to avoid noise.
Not "43% context used." Not "$0.12 spent." Just: "~8 turns until compaction."
RUNWAY FORECAST (Algorithm A2: Linear Runway Forecasting)
Point estimate: ~14 turns remaining
95% CI: [8, 20] turns
Confidence: MEDIUM (CV=0.31)
Velocity: 4,200 tokens/turn avg (sigma=1,302)
See exactly where your tokens go:
TOOL ANALYTICS (this session)
Read: 42 calls, ~18,400 tokens (34%)
Bash: 28 calls, ~14,200 tokens (26%)
Write: 15 calls, ~11,800 tokens (22%)
Configurable terse mode that cuts output token waste without losing information. Four levels: off / lite / full / ultra. Code stays verbose — only prose gets lean.
Re-reading a changed file? Emu shows only what changed instead of the full file. Re-reading an unchanged file? Blocked — with a preview and elapsed time.
Emu accumulates strategy success rates across sessions. After each report, it logs which compression rules fired, which drift patterns recurred, and which interventions worked — then adjusts its internal model via exponential moving average.
/emu:report shows exact savings per feature, drift alerts fired, turns
remaining, and accumulated learnings. Conservative methodology. We don't inflate numbers.
Every turn cycles through the same path. Tool calls hit PreToolUse (token-saver), then execute, then hit PostToolUse (context-guard). When context approaches full, PreCompact fires and state-keeper writes checkpoint.md before the wipe. On resume, the restorer agent reads the checkpoint back and the session continues without manual re-briefing.
Source: docs/assets/lifecycle.mmd · Regeneration command in docs/assets/README.md.
Every tool call flows through the same pipeline. When context fills up, state-keeper saves a checkpoint before the wipe, and the restorer agent brings it back autonomously.
Emu ships as 3 plugins cooperating across PreToolUse / PostToolUse / PreCompact. One meta-plugin — full — lists all three as dependencies, so a single install pulls in the whole platform.
In Claude Code (recommended):
/plugin marketplace add enchanter-ai/emu
/plugin install full@emu
Claude Code resolves the dependency list and installs all 3 plugins. Verify with /plugin list.
Want to cherry-pick? Individual plugins are still installable by name — e.g. /plugin install emu-context-guard@emu if you only want the drift/runway dashboard. The three lifecycle phases are designed to cooperate, though, so full@emu is the path we recommend.
Via shell (also installs shared/*.sh locally so hooks work offline):
bash <(curl -s https://raw.githubusercontent.com/enchanter-ai/emu/main/install.sh)
git clone https://github.com/enchanter-ai/emu
cd emu
./scripts/bootstrap.sh # canonical first command; installs vis sibling
Without ./scripts/bootstrap.sh, conduct imports will silently miss and Claude Code's @-loader will fail-soft. Always bootstrap first.
| Plugin | Hook | Command | Algorithms |
|---|---|---|---|
| state-keeper | PreCompact | /emu:checkpoint | A4 |
| token-saver | PreToolUse + PostToolUse | — | A3, A5, A6 |
| context-guard | PostToolUse | /emu:report | A1, A2, A8 |
| shared | — | — | A7, A9 |
| Agent | Model | Plugin | What |
|---|---|---|---|
| analyst | Haiku | context-guard | Background report generation |
| forecaster | Haiku | context-guard | Runway forecast with confidence interval |
| restorer | Haiku | state-keeper | Autonomous context restoration |
| compressor | Haiku | token-saver | Compression strategy analysis |
Tool calls write events to three plugin state directories. token-saver/state/metrics.jsonl records compressions, dedup blocks, and delta reads. context-guard/state/metrics.jsonl records per-turn token estimates and drift detections; learnings.json accumulates cross-session strategy rates (A7). state-keeper/state/ holds the latest checkpoint.md, any user-flagged remember.md, and checkpoint events. /emu:report reads all three plugins to produce the session dashboard.
Source: docs/assets/state-flow.mmd · Regeneration command in docs/assets/README.md.
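As a rough illustration of how a report could read these logs, here is a minimal Python sketch that tallies JSONL records by event type. It is not Emu's implementation (the plugins depend on bash + jq). The `event` field name and the `tally_events` helper are assumptions made for the example; only the event names come from the state layout.

```python
import json
from collections import Counter
from typing import Iterable


def tally_events(lines: Iterable[str], field: str = "event") -> Counter:
    """Count JSONL records by event type, skipping blank or malformed lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a partially written line should not break the report
        if isinstance(record, dict) and field in record:
            counts[str(record[field])] += 1
    return counts


# Hypothetical sample; the "event" key is an assumption, not a documented schema.
sample = [
    '{"event": "bash_compressed", "ts": 1}',
    '{"event": "duplicate_blocked", "ts": 2}',
    '{"event": "bash_compressed", "ts": 3}',
    '{"event": "delta_read"',  # truncated line, ignored
]
print(dict(tally_events(sample)))
```

Skipping malformed lines rather than failing matches the fail-open posture described for hooks elsewhere in this README.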
state-keeper/state/
├── checkpoint.md # Pre-compaction snapshot (branch, files, instructions)
├── remember.md # User-flagged context (/emu:checkpoint items)
└── metrics.jsonl # checkpoint_saved events
token-saver/state/
└── metrics.jsonl # bash_compressed, duplicate_blocked, delta_read events
context-guard/state/
├── metrics.jsonl # turn events — now include "skill" field (A8)
├── skill-metrics.jsonl # A8 — rich per-skill events (only when a skill is active)
├── active-skills.json # A8 — live scope stack (invocation-id keyed)
└── .session # A9 — per-worktree session id (gitignored)
$XDG_STATE_HOME/emu/<repo_id>/ # A9 — cross-worktree global
└── skill-metrics-global.<pid>.jsonl # per-PID shard; readers glob + merge
$XDG_DATA_HOME/emu/<repo_id>/ # A9 — long-lived learnings
└── learnings.json # A7 strategy rates; migrated from local
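The two global directories above can be resolved as in the following sketch. The fallbacks (`~/.local/state` and `~/.local/share` when the variable is unset or empty) are the defaults defined by the XDG Base Directory spec; whether Emu applies exactly these fallbacks is an assumption, and `emu_dirs` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def emu_dirs(
    repo_id: str,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[str] = None,
) -> Tuple[Path, Path]:
    """Return (state_dir, data_dir) for a repo, following XDG defaults."""
    env = os.environ if env is None else env
    home_path = Path.home() if home is None else Path(home)
    # XDG spec: an unset or empty variable falls back to the default location.
    state_base = env.get("XDG_STATE_HOME") or str(home_path / ".local" / "state")
    data_base = env.get("XDG_DATA_HOME") or str(home_path / ".local" / "share")
    return Path(state_base) / "emu" / repo_id, Path(data_base) / "emu" / repo_id


state_dir, data_dir = emu_dirs("abc123", env={}, home="/home/u")
print(state_dir)
print(data_dir)
```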
Tracked in docs/ROADMAP.md and the shared ecosystem map. For upcoming work specific to Emu, see issues tagged roadmap.
Nine named algorithms. Each one referenced in code, agents, and reports.
Pattern-matching finite automaton over tool call sequences.
States: PRODUCTIVE, READ_LOOP, EDIT_REVERT, TEST_FAIL_LOOP.
Transitions on tool name + file hash + exit code.
5-turn cooldown between alerts.
The read-loop threshold θ defaults to 3 (configurable via EMU_DRIFT_READ_THRESHOLD).
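A minimal Python sketch of the READ_LOOP branch only, assuming an alert fires once the same unchanged file has been read θ times and that the cooldown suppresses alerts for the next 5 turns. The comparison (`>=` vs `>`), the "any non-read tool call resets the counter" rule, and the `ReadLoopDetector` name are assumptions for illustration, not the shipped logic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ReadLoopDetector:
    threshold: int = 3  # theta from the text
    cooldown: int = 5   # turns between alerts, from the text
    _reads: Dict[str, Tuple[str, int]] = field(default_factory=dict)
    _last_alert_turn: Optional[int] = None

    def observe(self, turn: int, tool: str, path: str, content_hash: str) -> Optional[str]:
        """Feed one tool call; return an alert string or None."""
        if tool != "Read":
            # Assumption: any other tool touching the file counts as progress.
            self._reads.pop(path, None)
            return None
        prev_hash, count = self._reads.get(path, (content_hash, 0))
        # A changed hash means the file moved on, so the streak restarts.
        count = count + 1 if prev_hash == content_hash else 1
        self._reads[path] = (content_hash, count)
        if count < self.threshold:
            return None
        in_cooldown = (
            self._last_alert_turn is not None
            and turn - self._last_alert_turn < self.cooldown
        )
        if in_cooldown:
            return None
        self._last_alert_turn = turn
        return f"READ_LOOP: {path} read {count}x without changes"


detector = ReadLoopDetector()
for turn in range(1, 5):
    print(turn, detector.observe(turn, "Read", "src/auth.ts", "h1"))
```

EDIT_REVERT and TEST_FAIL_LOOP would need the file hash history and exit codes respectively; they are left out to keep the sketch short.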
Estimates turns until compaction from a sliding window of token velocities.
The forecast uses C_max = 200,000 tokens and t̄_w, the windowed mean token velocity over recent turns.
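One way to turn those quantities into a forecast is sketched below in Python: divide the remaining budget by the windowed mean velocity, and derive a band from the standard error of that mean. The normal-approximation interval (z = 1.96) is an assumption chosen for the example; the README does not state how Emu computes its CI. The function name and return shape are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def runway(
    used_tokens: int,
    velocities: Sequence[float],
    c_max: int = 200_000,
    z: float = 1.96,
) -> Dict[str, object]:
    """Estimate turns until compaction from recent tokens-per-turn samples."""
    if len(velocities) < 2:
        raise ValueError("need at least two turns to estimate spread")
    v_mean = mean(velocities)
    if v_mean <= 0:
        raise ValueError("mean velocity must be positive")
    v_sd = stdev(velocities)
    se = v_sd / math.sqrt(len(velocities))  # standard error of the mean
    remaining = max(c_max - used_tokens, 0)
    # Faster burn gives fewer turns, so the upper velocity bound gives the lower CI edge.
    low = remaining / (v_mean + z * se)
    slow = v_mean - z * se
    high = remaining / slow if slow > 0 else math.inf
    return {"point": remaining / v_mean, "ci": (low, high), "cv": v_sd / v_mean}


print(runway(140_000, [4000, 5000, 3000, 5000, 3000]))
```

The coefficient of variation (`cv`, sigma divided by mean) matches how the sample output relates its numbers: 1,302 / 4,200 ≈ 0.31.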
Reduces output
15 pattern-matched rules for input compression. Extensions:
- Shannon Output Compression — prose terse mode (4 levels)
- Temporal Decay Compression — age-based result stubbing
Write-validate-rename protocol for checkpoint persistence.
50KB bound. Atomic mkdir locking (never flock).
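The protocol can be sketched in Python as follows: take a lock by creating a directory (which either succeeds or fails atomically), write to a temp file in the same directory, validate the size, then rename over the target. This is an illustration under assumptions, not Emu's bash implementation; the `.lock` name, the timeout, and `save_checkpoint` are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path

MAX_BYTES = 50 * 1024  # 50KB bound from the text


def save_checkpoint(state_dir: Path, text: str, lock_timeout: float = 2.0) -> bool:
    """Write checkpoint.md atomically; return False if locking or validation fails."""
    state_dir.mkdir(parents=True, exist_ok=True)
    lock = state_dir / ".lock"  # hypothetical lock directory name
    deadline = time.monotonic() + lock_timeout
    while True:
        try:
            os.mkdir(lock)  # atomic: exactly one process can create it
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.05)
    tmp = None
    try:
        # Same directory as the target, so the final rename stays on one filesystem.
        fd, tmp = tempfile.mkstemp(dir=state_dir, prefix="checkpoint.", suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        size = os.path.getsize(tmp)
        if size == 0 or size > MAX_BYTES:
            return False  # validation failed; the previous checkpoint is untouched
        os.replace(tmp, state_dir / "checkpoint.md")
        tmp = None
        return True
    finally:
        if tmp is not None and os.path.exists(tmp):
            os.unlink(tmp)
        os.rmdir(lock)
```

Because the rename is the last step, a reader sees either the old checkpoint or the new one, never a half-written file.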
SHA-256 hash + TTL cache for read deduplication.
TTL = 600s. Block unchanged, allow after expiry.
Extension of A5. Tracks when changed files are re-read within TTL and emits telemetry:
When a file is re-read after modification within the TTL window, the hook logs a delta_read event recording file path, full line count, and diff line count. Diff generation (unified format with context) is deferred to Phase 2; current implementation is telemetry-only.
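A5 and A6 together amount to a three-way decision on each read, sketched here in Python. The class name, the return labels, and the choice not to refresh the timestamp on a blocked read are assumptions; the TTL value and the SHA-256 hash come from the text.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class ReadDedup:
    def __init__(self, ttl: float = 600.0, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._seen: Dict[str, Tuple[str, float]] = {}  # path -> (digest, time)

    def check(self, path: str, content: bytes) -> str:
        """Return 'block', 'delta', or 'allow' for a read of `path`."""
        digest = hashlib.sha256(content).hexdigest()
        now = self.clock()
        prev = self._seen.get(path)
        if prev is not None:
            prev_digest, seen_at = prev
            if now - seen_at < self.ttl:
                if prev_digest == digest:
                    return "block"  # unchanged within TTL (A5)
                self._seen[path] = (digest, now)
                return "delta"      # changed within TTL: log a delta_read event (A6)
        self._seen[path] = (digest, now)
        return "allow"              # first read, or TTL expired


clock = [0.0]
dedup = ReadDedup(clock=lambda: clock[0])
print(dedup.check("src/auth.ts", b"v1"))
clock[0] = 30.0
print(dedup.check("src/auth.ts", b"v1"))
```

Injecting the clock keeps the TTL logic testable without sleeping.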
Exponential moving average (EMA) over compression strategy success rates and drift frequencies across sessions.
Uses EMA with α=0.3 to blend current-session metrics into historical rates for compression rules, drift patterns, and velocity. Detects dormant rules, chronic drift patterns, and velocity trends. Persisted to learnings.json after each report.
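The update step is the standard exponential moving average, shown here as a small Python sketch. It assumes α weights the current session (new = α·current + (1 − α)·history) and that a strategy with no history starts at its first observed rate; the key names in the example are hypothetical.

```python
from typing import Dict, Optional


def ema_update(history: Optional[float], current: float, alpha: float = 0.3) -> float:
    """Blend the current session's rate into the historical rate."""
    if history is None:
        return current  # assumption: the first observation seeds the average
    return alpha * current + (1 - alpha) * history


def blend(rates: Dict[str, float], session: Dict[str, float], alpha: float = 0.3) -> Dict[str, float]:
    """Apply the EMA update to every strategy seen this session."""
    merged = dict(rates)
    for name, value in session.items():
        merged[name] = ema_update(rates.get(name), value, alpha)
    return merged


# Hypothetical strategy names, for illustration only.
print(blend({"rule_pytest": 0.5}, {"rule_pytest": 1.0, "rule_tsc": 0.2}))
```

With α = 0.3, one unusually good or bad session moves a rate by at most 30% of the gap, so a rule has to underperform repeatedly before it reads as dormant.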
Every tool call is attributed to the currently-active skill (or manual
if none is registered). Skills register a scope at entry, unregister at exit;
the stack supports nesting so a parent skill that invokes a child skill still
has correct parent/child lineage on every event.
Each scope carries a TTL (configurable via EMU_SKILL_TTL).
Scopes are keyed by 16-hex-char invocation ids — not PIDs — so entries survive
PID reuse (systemd InvocationID pattern). Eviction on every read: stale
entries (dead PID or expired TTL) are purged before the "current" scope is returned.
Emitted as skill-metrics.jsonl alongside metrics.jsonl. /emu:analytics
surfaces the per-skill breakdown.
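The scope stack described above might look like the following Python sketch. Ids are 16 hex characters from `secrets.token_hex(8)`; stale entries are evicted before every read. The TTL has no default here because the README does not state one, and the PID liveness check is injected because it is platform-specific. Both, along with the `SkillScopes` name, are assumptions.

```python
import secrets
import time
from typing import Callable, List, Optional


class SkillScopes:
    def __init__(
        self,
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
        pid_alive: Callable[[int], bool] = lambda pid: True,  # placeholder check
    ):
        self.ttl, self.clock, self.pid_alive = ttl, clock, pid_alive
        self._stack: List[dict] = []

    def enter(self, skill: str, pid: int) -> str:
        """Register a scope; the current top of the stack becomes its parent."""
        self._evict()
        parent: Optional[str] = self._stack[-1]["id"] if self._stack else None
        scope = {"id": secrets.token_hex(8), "skill": skill, "pid": pid,
                 "parent": parent, "started": self.clock()}
        self._stack.append(scope)
        return scope["id"]

    def exit(self, invocation_id: str) -> None:
        """Unregister by invocation id, not by PID."""
        self._stack = [s for s in self._stack if s["id"] != invocation_id]

    def current(self) -> str:
        """Skill to attribute the next tool call to; 'manual' if none is active."""
        self._evict()
        return self._stack[-1]["skill"] if self._stack else "manual"

    def _evict(self) -> None:
        now = self.clock()
        self._stack = [s for s in self._stack
                       if self.pid_alive(s["pid"]) and now - s["started"] < self.ttl]


scopes = SkillScopes(ttl=3600.0)  # hypothetical TTL value
outer = scopes.enter("parent-skill", pid=1234)
inner = scopes.enter("child-skill", pid=1234)
print(scopes.current())
scopes.exit(inner)
print(scopes.current())
```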
Concurrent Claude Code sessions across multiple git worktrees of the same repo are unified into one view by the root-commit hash:
The root commit is stable across clones, forks, renames, and worktree paths;
basename-of-toplevel is not. Cross-worktree events land in
$XDG_STATE_HOME/emu/<repo_id>/, sharded per-PID
(skill-metrics-global.<pid>.jsonl) to avoid concurrent-append
interleaving on filesystems without atomicity guarantees (Windows, NFS).
Readers glob all shards and merge by ts:
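A reader along these lines can be sketched in Python: glob every per-PID shard, parse each line, and sort by `ts`. The shard filename pattern and the `ts` and `skill` fields come from this README; skipping malformed lines and the `merge_shards` name are assumptions.

```python
import json
from pathlib import Path
from typing import List


def merge_shards(global_dir: Path) -> List[dict]:
    """Merge all per-PID shards into one list ordered by timestamp."""
    events: List[dict] = []
    for shard in sorted(global_dir.glob("skill-metrics-global.*.jsonl")):
        for line in shard.read_text(encoding="utf-8").splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a torn final line from a live writer
            if isinstance(record, dict) and "ts" in record:
                events.append(record)
    # sort() is stable, so events with equal ts keep shard order.
    events.sort(key=lambda record: record["ts"])
    return events
```

Since each process appends only to its own shard, no writer needs a lock; the merge cost is paid once, at read time.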
/emu:report renders a WORKTREE OVERVIEW section when ≥ 2 worktrees have
written. /emu:report --global forces the unified view across every session
recorded in the global dir.
Learnings (A7) also migrate to $XDG_DATA_HOME/emu/<repo_id>/learnings.json —
the data dir per XDG spec — so cross-session accumulation survives cache wipes
and spans every worktree without symlinks.
| Command | Plugin | What |
|---|---|---|
| /emu:report | context-guard | Full session dashboard. --global for unified cross-worktree view (A9). |
| /emu:runway | context-guard | Quick turns-until-compaction check |
| /emu:analytics | context-guard | Per-tool + per-skill token breakdown (A8) |
| /emu:doctor | context-guard | Diagnostic self-check for all plugins |
| /emu:checkpoint [text] | state-keeper | Save context that survives compaction |
| /emu:checkpoint-show | state-keeper | Display most recent automatic checkpoint |
| Pattern | Action |
|---|---|
| npm/yarn/pnpm test, vitest, jest | tail -n 40 |
| pytest, python -m unittest | filter pass/fail summary |
| go test | filter PASS/FAIL lines |
| mvn/gradle test | filter BUILD + test summary |
| dotnet build/test | filter pass/fail summary |
| npm/yarn/pnpm install | filter errors/warnings |
| cargo build/test | filter errors/warnings |
| make | filter errors or "Build succeeded" |
| docker build | filter layer summaries + image ID |
| terraform plan | filter Plan summary |
| eslint | filter error count + first errors |
| tsc | filter TS errors |
| git log (verbose) | --oneline -20 |
| find (no head) | head -n 30 |
| cat (>100 lines) | head -n 80 + line count |
Bypass: prefix with FULL: to skip compression.
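To make the rule table and the bypass concrete, here is a Python sketch covering two of the fifteen rules. It assumes a rule works by appending a pipe to the command before it runs, and that `FULL:` is stripped before execution; both are guesses about the mechanism, and the `2>&1` redirect and the `rewrite` name are additions for the example.

```python
import re

# (pattern, filter to pipe into); a hypothetical subset of the rule table.
RULES = [
    (re.compile(r"^(?:(?:npm|yarn|pnpm)\s+test|vitest|jest)\b"), "tail -n 40"),
    (re.compile(r"^find\b(?!.*\|\s*head\b)"), "head -n 30"),  # find with no head
]


def rewrite(command: str) -> str:
    """Return the command to run, compressed unless FULL: is prefixed."""
    stripped = command.strip()
    if stripped.startswith("FULL:"):
        return stripped[len("FULL:"):].strip()  # bypass: run as written
    for pattern, pipe_to in RULES:
        if pattern.search(stripped):
            return f"{stripped} 2>&1 | {pipe_to}"
    return stripped  # no rule matched


for cmd in ["npm test", "FULL: npm test", "find . -name '*.ts'", "ls"]:
    print(repr(rewrite(cmd)))
```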
| | Emu | Caveman | Cozempic | context-mode | token-optimizer |
|---|---|---|---|---|---|
| Drift detection | real-time, 3 patterns | — | — | — | — |
| Turn forecast | Runway + 95% CI | — | threshold only | — | — |
| Output reduction | 4 modes | 65% prose cut | — | — | — |
| Input compression | 15 rules | — | 18 strategies | — | — |
| Delta mode | diff on re-read | — | — | — | delta mode |
| Per-skill + per-tool analytics | /emu:analytics (A8) | — | — | per-tool only | waste dashboard |
| Cross-worktree unified view | /emu:report --global (A9) | — | — | — | — |
| Tool result aging | age-based alerts | — | 3-tier stubbing | — | — |
| Savings proof | /emu:report | — | session report | ctx_stats | quality score |
| Compaction survival | checkpoint.md | — | team state | SQLite | checkpoints |
| Self-learning | learnings.json | — | — | — | — |
| Agents | 4 (Haiku) | — | — | — | — |
| Dependencies | bash + jq | — | Python | Node.js + MCP | Node.js |
Combined: 30-45% token reduction. Not 70%. Honest numbers. Plus the only tool that catches Claude going in circles — and learns from it.
Every skill inherits a reusable behavioral contract from shared/ — loaded once into CLAUDE.md, applied across all plugins. This is how Claude acts inside Emu: deterministic, surgical, verifiable. Not a suggestion; a contract.
| Module | What it governs |
|---|---|
| discipline.md | Coding conduct: think-first, simplicity, surgical edits, goal-driven loops |
| context.md | Attention-budget hygiene, U-curve placement, checkpoint protocol |
| verification.md | Independent checks, baseline snapshots, dry-run for destructive ops |
| delegation.md | Subagent contracts, tool whitelisting, parallel vs. serial rules |
| failure-modes.md | 14-code taxonomy for accumulated-learning logs |
| tool-use.md | Tool-choice hygiene, error payload contract, parallel-dispatch rules |
| skill-authoring.md | SKILL.md frontmatter discipline, discovery test |
| hooks.md | Advisory-only hooks, injection over denial, fail-open |
| precedent.md | Log self-observed failures to state/precedent-log.md; consult before risky steps |
| tier-sizing.md | Prompt verbosity scales inversely with model tier; Haiku needs mechanical steps, Opus runs on intent |
| web-fetch.md | External URL handling: cache, dedup, budget; WebFetch is Haiku-tier-only |
Full interactive architecture explorer with 4 tabbed diagrams and plugin component cards:
docs/architecture/ — auto-generated from the codebase. Run python docs/architecture/generate.py to regenerate.
Emu builds on substrate laid by others:
- Claude Code (Anthropic) — the plugin surface this work extends.
- Keep a Changelog — CHANGELOG convention.
- Semantic Versioning — versioning contract.
- Contributor Covenant — Code of Conduct.
- repostatus.org — status badge.
- Citation File Format — citation metadata.
- Conventional Commits — commit convention.
Emu follows Semantic Versioning. Breaking changes land on major bumps only; the CHANGELOG flags them explicitly. Release cadence is opportunistic — tags land when accumulated fixes or features justify a cut, not on a fixed schedule. Migration notes between majors live in docs/upgrading.md.
See CONTRIBUTING.md
If you use this project in research or derivative work, please cite it:
@software{emu_2026,
title = {Emu},
author = {{Klaiderman}},
year = {2026},
url = {https://github.com/enchanter-ai/emu}
}
See CITATION.cff for additional formats (APA, MLA, EndNote).
MIT
Emu is the session-health layer — it watches the token economy of every Claude Code session. Upstream, Wixie's prompts arrive through the conversation and Emu measures them; tool-call output flows through the same observation path. Downstream, Pech reads Emu's per-turn token accounting and attributes it across plugin × sub-plugin × agent tier × model for forecast and budget purposes.
Emu does not engineer prompts (Wixie's lane), score change trust (Crow's lane), review code correctness (Lich's lane), enforce budget gates via kill-switches (Pech uses cooperative degradation, not pre-emption), or scan security surfaces (Hydra's lane). It observes token burn and keeps the session recoverable across compaction.
See ../wixie/docs/ecosystem.md § Data Flow Between Plugins for the full map.
