enchanter-ai/emu

Emu

License: MIT · 3 plugins · 9 algorithms · 4 agents · Honest numbers contract · Project Status: Active

An @enchanter-ai product — algorithm-driven, agent-managed, self-learning.

The context health platform that learns what wastes your tokens — and stops it.

3 plugins. 9 algorithms. 4 agents. Honest numbers.

40 minutes into a session, Emu told me Claude had been editing and reverting the same file for 12 minutes. I didn't notice. It did.

TL;DR

In plain English: Some sessions burn 200k tokens to ship twelve lines. Emu tells you while you can still steer out — not after the context dies and you start over.

Technically: A1 Markov Drift Detection classifies turn sequences into READ_LOOP / EDIT_REVERT / TEST_FAIL_LOOP patterns, with a 5-turn cooldown between alerts; A2 Linear Runway Forecasting estimates the turns remaining until compaction with a 95% CI band. Context savings are recorded per strategy in metrics.jsonl and accumulated across sessions via A7 Exponential Strategy Averaging. Every advisory cites an observed pattern, not inferred intent.


Origin

Emu takes its name from the emu in Alex's Mobs: a flightless bird that stands tall on the open plain and uses its long-range vision to spot threats on the horizon long before they arrive. Emu watches your session the same way: eyes on the token flow, flagging the runway edge before you hit it.

The question this plugin answers: What did I spend?

Who this is for

  • Developers who've watched a session burn through context on re-reads, revert loops, or verbose tool output — and want an objective readout instead of a gut feeling.
  • Teams that need honest numbers (runway with a ±CI, drift with a specific pattern) for retrospectives, not a marketing dashboard.
  • Privacy-conscious users — everything Emu observes stays local; there is no outbound network code (see PRIVACY.md).

Not for:

  • Centralized-observability teams who need a cloud dashboard — Emu is machine-local by design.
  • Sessions where context burn is clearly a smaller problem than code correctness. Reach for Crow or Lich first, then Emu.

How It Works

Emu splits into three plugins that each own one lifecycle phase. token-saver fires on PreToolUse to compress verbose Bash output (A3), block duplicate file reads (A5), and log delta-read telemetry on changed re-reads (A6). context-guard fires on PostToolUse to forecast runway (A2) and detect drift patterns (A1). state-keeper fires on PreCompact to write an atomic checkpoint (A4). Across sessions, A7 accumulates per-strategy success rates. The diagram below shows this flow.

Emu hook bindings: Claude Code tool calls fan out into token-saver (PreToolUse · A3/A5/A6), context-guard (PostToolUse · A1/A2), state-keeper (PreCompact · A4); drift/metric events feed A7 Exponential Strategy Averaging for cross-session accumulation

Source: docs/assets/pipeline.mmd · Regeneration command in docs/assets/README.md.

Three plugins. Three lifecycle phases. No overlap. No dependencies between plugins.

What Makes Emu Different

Drift Alert

Catches Claude spinning in circles — in real time, not after the fact:

⚠️ Drift Alert: src/auth.ts read 4× without changes.
Claude may be stuck re-reading without progress.
→ Reframe the problem or /emu:checkpoint before /compact.

Three patterns: read loops, edit-revert cycles, test fail loops. 5-turn cooldown between alerts to avoid noise.

Token Runway

Not "43% context used." Not "$0.12 spent." Just: "~8 turns until compaction."

RUNWAY FORECAST (Algorithm A2: Linear Runway Forecasting)

Point estimate:  ~14 turns remaining
95% CI:          [8, 20] turns
Confidence:      MEDIUM (CV=0.31)
Velocity:        4,200 tokens/turn avg (sigma=1,302)

Per-Tool Analytics

See exactly where your tokens go:

TOOL ANALYTICS (this session)
  Read:    42 calls, ~18,400 tokens (34%)
  Bash:    28 calls, ~14,200 tokens (26%)
  Write:   15 calls, ~11,800 tokens (22%)

Output Efficiency

Configurable terse mode that cuts output token waste without losing information. Four levels: off / lite / full / ultra. Code stays verbose — only prose gets lean.

Delta Mode

Re-reading a changed file? Emu logs a delta_read event with the full and diff line counts; returning only the diff instead of the full file is deferred to Phase 2 (see A6). Re-reading an unchanged file? Blocked, with a preview and elapsed time.

Self-Learning

Emu accumulates strategy success rates across sessions. After each report, it logs which compression rules fired, which drift patterns recurred, and which interventions worked — then adjusts its internal model via exponential moving average.

The Receipt

/emu:report shows exact savings per feature, drift alerts fired, turns remaining, and accumulated learnings. Conservative methodology. We don't inflate numbers.

The Full Lifecycle

Every turn cycles through the same path. Tool calls hit PreToolUse (token-saver), then execute, then hit PostToolUse (context-guard). When context approaches full, PreCompact fires and state-keeper writes checkpoint.md before the wipe. On resume, the restorer agent reads the checkpoint back and the session continues without manual re-briefing.

Emu session lifecycle: session start, turn N tool call, PreToolUse (token-saver) compresses, tool executes, PostToolUse (context-guard) detects drift / forecasts runway, loop continues until C(t) ≥ C_max triggers compaction; PreCompact (state-keeper) writes checkpoint.md; context wiped; restorer agent reads checkpoint; session continues

Source: docs/assets/lifecycle.mmd · Regeneration command in docs/assets/README.md.

Every tool call flows through the same pipeline. When context fills up, state-keeper saves a checkpoint before the wipe, and the restorer agent brings it back autonomously.

Install

Emu ships as 3 plugins cooperating across PreToolUse / PostToolUse / PreCompact. One meta-plugin — full — lists all three as dependencies, so a single install pulls in the whole platform.

In Claude Code (recommended):

/plugin marketplace add enchanter-ai/emu
/plugin install full@emu

Claude Code resolves the dependency list and installs all 3 plugins. Verify with /plugin list.

Want to cherry-pick? Individual plugins are still installable by name — e.g. /plugin install emu-context-guard@emu if you only want the drift/runway dashboard. The three lifecycle phases are designed to cooperate, though, so full@emu is the path we recommend.

Via shell (also installs shared/*.sh locally so hooks work offline):

bash <(curl -s https://raw.githubusercontent.com/enchanter-ai/emu/main/install.sh)

Quickstart

git clone https://github.com/enchanter-ai/emu
cd emu
./scripts/bootstrap.sh    # canonical first command — installs vis sibling

Without ./scripts/bootstrap.sh, conduct imports will silently miss and Claude Code's @-loader will fail-soft. Always bootstrap first.

3 Plugins, 4 Agents, 9 Algorithms

| Plugin | Hook | Command | Algorithms |
|---|---|---|---|
| state-keeper | PreCompact | /emu:checkpoint | A4 |
| token-saver | PreToolUse + PostToolUse | | A3, A5, A6 |
| context-guard | PostToolUse | /emu:report | A1, A2, A8 |
| shared | | | A7, A9 |

| Agent | Model | Plugin | What |
|---|---|---|---|
| analyst | Haiku | context-guard | Background report generation |
| forecaster | Haiku | context-guard | Runway forecast with confidence interval |
| restorer | Haiku | state-keeper | Autonomous context restoration |
| compressor | Haiku | token-saver | Compression strategy analysis |

What You Get Per Session

Tool calls write events to three plugin state directories. token-saver/state/metrics.jsonl records compressions, dedup blocks, and delta reads. context-guard/state/metrics.jsonl records per-turn token estimates and drift detections; learnings.json accumulates cross-session strategy rates (A7). state-keeper/state/ holds the latest checkpoint.md, any user-flagged remember.md, and checkpoint events. /emu:report reads all three plugins to produce the session dashboard.

Emu per-session state flow: tool calls (Bash, Read, Write, Glob/Grep) append events to three JSONL journals (token-saver, context-guard, state-keeper), which are merged by /emu:report into a session dashboard; A7 learnings accumulate across sessions in learnings.json

Source: docs/assets/state-flow.mmd · Regeneration command in docs/assets/README.md.

state-keeper/state/
├── checkpoint.md        # Pre-compaction snapshot (branch, files, instructions)
├── remember.md          # User-flagged context (/emu:checkpoint items)
└── metrics.jsonl        # checkpoint_saved events

token-saver/state/
└── metrics.jsonl        # bash_compressed, duplicate_blocked, delta_read events

context-guard/state/
├── metrics.jsonl        # turn events — now include "skill" field (A8)
├── skill-metrics.jsonl  # A8 — rich per-skill events (only when a skill is active)
├── active-skills.json   # A8 — live scope stack (invocation-id keyed)
└── .session             # A9 — per-worktree session id (gitignored)

$XDG_STATE_HOME/emu/<repo_id>/       # A9 — cross-worktree global
└── skill-metrics-global.<pid>.jsonl   # per-PID shard; readers glob + merge

$XDG_DATA_HOME/emu/<repo_id>/        # A9 — long-lived learnings
└── learnings.json                     # A7 strategy rates; migrated from local

Roadmap

Tracked in docs/ROADMAP.md and the shared ecosystem map. For upcoming work specific to Emu, see issues tagged roadmap.

The Science Behind Emu

Nine named algorithms. Each one referenced in code, agents, and reports.

A1. Markov Drift Detection

Pattern-matching finite automaton over tool call sequences.

States: PRODUCTIVE, READ_LOOP, EDIT_REVERT, TEST_FAIL_LOOP. Transitions on tool name + file hash + exit code. 5-turn cooldown between alerts.

$$P(\text{drift} \mid s_1, \ldots, s_n) = \begin{cases} 1 & \text{if the count of repeated states} \geq \theta \\ 0 & \text{otherwise} \end{cases}$$

Where θ = 3 (configurable via EMU_DRIFT_READ_THRESHOLD).
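
As an illustration of the threshold-plus-cooldown rule, here is a minimal Python sketch of the READ_LOOP case only. It is not Emu's implementation (the source lists bash + jq as Emu's dependencies); the class name `ReadLoopDetector`, the `observe` method, and the reading of "cooldown" as "at least 5 turns since the last alert" are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

THETA = 3            # repeat threshold from the text (EMU_DRIFT_READ_THRESHOLD)
COOLDOWN_TURNS = 5   # minimum turns between alerts, from the text


@dataclass
class ReadLoopDetector:
    """Hypothetical detector: alerts when one file is read THETA times unchanged."""
    theta: int = THETA
    cooldown: int = COOLDOWN_TURNS
    counts: dict = field(default_factory=dict)  # path -> (file hash, repeat count)
    last_alert_turn: Optional[int] = None

    def observe(self, turn: int, path: str, file_hash: str) -> bool:
        """Record one Read call; return True if an alert should fire this turn."""
        prev_hash, count = self.counts.get(path, (None, 0))
        # A changed hash means progress was made, so the repeat count restarts.
        count = count + 1 if prev_hash == file_hash else 1
        self.counts[path] = (file_hash, count)

        drifting = count >= self.theta
        cooled = (self.last_alert_turn is None
                  or turn - self.last_alert_turn >= self.cooldown)
        if drifting and cooled:
            self.last_alert_turn = turn
            return True
        return False
```

Feeding it three reads of the same path with the same hash fires on the third; a fourth read on the next turn stays silent because the cooldown has not elapsed.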

A2. Linear Runway Forecasting

Estimates turns until compaction from a sliding window of token velocities.

$$\hat{R} = \frac{C_{\max} - \sum_i t_i}{\bar{t}_w}, \qquad 95\%\ \text{CI} = \hat{R} \pm 1.96 \cdot \frac{\sigma_t}{\bar{t}_w} \cdot \hat{R}$$

Where C_max = 200,000 tokens and t̄_w is the windowed mean of recent turns.
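
The formula can be evaluated directly. The sketch below is a Python illustration, not Emu's code: the function name `forecast_runway`, the default window of 5 turns, and the use of the sample standard deviation for sigma_t are assumptions, since the text does not state them.

```python
import statistics

C_MAX = 200_000  # compaction ceiling in tokens, from the text


def forecast_runway(turn_tokens, window=5, c_max=C_MAX):
    """Return (point_estimate, ci_low, ci_high) in turns.

    turn_tokens: per-turn token counts for the session so far.
    window: how many recent turns feed the mean (assumed value).
    """
    recent = turn_tokens[-window:]
    t_bar = statistics.mean(recent)
    if t_bar <= 0:
        raise ValueError("windowed mean must be positive")
    # Sample standard deviation is an assumption; one turn gives no spread.
    sigma = statistics.stdev(recent) if len(recent) > 1 else 0.0

    remaining = max(c_max - sum(turn_tokens), 0)
    r_hat = remaining / t_bar
    half_width = 1.96 * (sigma / t_bar) * r_hat
    return r_hat, max(r_hat - half_width, 0.0), r_hat + half_width
```

With ten turns of exactly 4,000 tokens each, 160,000 tokens remain and the estimate is 40 turns with a zero-width interval, because the velocity has no variance.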

A3. Shannon Compression

Reduces output $O$ to $O'$ preserving information density above threshold $\theta$:

$$H(O') \geq \theta \cdot H(O), \qquad \theta = 1.0\ (\text{code}),\ 0.7\ (\text{tests}),\ 0.3\ (\text{logs})$$

15 pattern-matched rules for input compression. Extensions:

  • Shannon Output Compression — prose terse mode (4 levels)
  • Temporal Decay Compression — age-based result stubbing

A4. Atomic State Serialization

Write-validate-rename protocol for checkpoint persistence.

write(tmp) -> validate(tmp) -> rename(tmp, target)

Checkpoints are bounded at 50 KB. Locking uses atomic mkdir, never flock.
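
A minimal Python sketch of the write-validate-rename sequence with a mkdir lock follows. It is an illustration under assumptions: the function name `save_checkpoint`, the `.lock` suffix, and the validation step (re-reading the temp file and comparing bytes) are hypothetical, since the text does not say what validation checks.

```python
import os
import tempfile

MAX_BYTES = 50 * 1024  # 50 KB bound, from the text


def save_checkpoint(target: str, content: str) -> None:
    """Write to a temp file, validate it, then atomically rename over target."""
    data = content.encode("utf-8")
    if not data or len(data) > MAX_BYTES:
        raise ValueError("checkpoint is empty or exceeds the 50 KB bound")

    directory = os.path.dirname(os.path.abspath(target))
    lock_dir = target + ".lock"  # hypothetical lock location
    os.mkdir(lock_dir)  # atomic: raises FileExistsError if a writer holds it
    try:
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as fh:
                fh.write(data)
                fh.flush()
                os.fsync(fh.fileno())
            # validate(tmp): assumed here to mean "the bytes landed intact"
            with open(tmp, "rb") as fh:
                if fh.read() != data:
                    raise OSError("checkpoint validation failed")
            os.replace(tmp, target)  # atomic rename within one filesystem
        finally:
            if os.path.exists(tmp):
                os.remove(tmp)  # only reached if the rename did not happen
    finally:
        os.rmdir(lock_dir)
```

Creating the temp file in the target's own directory keeps the rename on a single filesystem, which is what makes `os.replace` atomic; a reader sees either the old checkpoint or the new one, never a partial write.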

A5. Content-Addressable Dedup

SHA-256 hash + TTL cache for read deduplication.

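$$\text{decision}(f) = \begin{cases} \text{BLOCK} & \text{if the hash matches the cache and } \Delta t < \text{TTL} \\ \text{ALLOW} & \text{if } \Delta t \geq \text{TTL} \end{cases}$$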

TTL = 600s. Block unchanged, allow after expiry.
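
The decision rule can be sketched in Python as below. This is an illustration, not Emu's hook: the class name `ReadDedupCache`, the injectable `clock`, and the choice to allow (and re-cache) a read whose hash differs within the TTL are assumptions; the last case is the one A6 describes.

```python
import hashlib
import time

TTL_SECONDS = 600  # from the text


class ReadDedupCache:
    """Hypothetical content-addressable cache for file reads."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self._seen = {}  # path -> (sha256 hex digest, time of last allowed read)

    def decide(self, path: str, content: bytes) -> str:
        """Return "BLOCK" for an unchanged re-read inside the TTL, else "ALLOW"."""
        digest = hashlib.sha256(content).hexdigest()
        now = self.clock()
        cached = self._seen.get(path)
        if cached is not None:
            cached_digest, cached_at = cached
            if cached_digest == digest and now - cached_at < self.ttl:
                return "BLOCK"
        # First read, changed content, or expired entry: allow and refresh.
        self._seen[path] = (digest, now)
        return "ALLOW"
```

Passing a fake clock makes the TTL boundary easy to exercise: a read at t=0 is allowed, the same content at t=10 is blocked, and the same content at t=600 is allowed again.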

A6. Delta-Read Telemetry

Extension of A5. Tracks when changed files are re-read within TTL and emits telemetry:

event: delta_read logged when hash differs from cache and Δt < TTL

When a file is re-read after modification within the TTL window, the hook logs a delta_read event recording file path, full line count, and diff line count. Diff generation (unified format with context) is deferred to Phase 2; current implementation is telemetry-only.
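
One way such an event could be built is sketched below in Python. The field names (`path`, `full_lines`, `diff_lines`), the function name `delta_read_event`, and the assumption that the previous file text is available for comparison are all hypothetical; only the `delta_read` event name and the three recorded quantities come from the text.

```python
import difflib
import json


def delta_read_event(path: str, old_text: str, new_text: str) -> dict:
    """Build a telemetry record for a changed re-read (no diff is returned)."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # n=0 drops context lines, so only headers, hunk markers, and changes remain.
    diff = list(difflib.unified_diff(old_lines, new_lines, lineterm="", n=0))
    body = diff[2:]  # skip the "---" and "+++" file headers
    changed = sum(1 for line in body if line.startswith(("+", "-")))
    return {
        "event": "delta_read",
        "path": path,
        "full_lines": len(new_lines),
        "diff_lines": changed,
    }


def to_jsonl(event: dict) -> str:
    """Serialize one event as a single JSONL line."""
    return json.dumps(event, sort_keys=True)
```

The ratio of `diff_lines` to `full_lines` is the quantity that would indicate how much a diff-only response could save once diff generation exists.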

A7. Exponential Strategy Averaging

Exponential moving average (EMA) over compression strategy success rates and drift frequencies across sessions.

$$r_{\text{new}} = \alpha \cdot s_{\text{current}} + (1 - \alpha) \cdot r_{\text{prior}}, \qquad \alpha = 0.3$$

Uses EMA with α=0.3 to blend current-session metrics into historical rates for compression rules, drift patterns, and velocity. Detects dormant rules, chronic drift patterns, and velocity trends. Persisted to learnings.json after each report.
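
The update is a one-line computation. In this Python sketch the function names and the dictionary layout are assumptions, as is seeding a strategy that has no prior with its current-session rate; the text does not say how new strategies are initialized.

```python
ALPHA = 0.3  # from the text


def ema_update(prior: float, current: float, alpha: float = ALPHA) -> float:
    """Blend the current-session rate into the historical rate."""
    return alpha * current + (1 - alpha) * prior


def update_rates(prior_rates: dict, session_rates: dict,
                 alpha: float = ALPHA) -> dict:
    """Apply the EMA per strategy; unseen strategies start at their current rate."""
    merged = dict(prior_rates)
    for name, current in session_rates.items():
        if name in prior_rates:
            merged[name] = ema_update(prior_rates[name], current, alpha)
        else:
            merged[name] = current  # assumed seeding rule
    return merged
```

For example, a historical rate of 0.5 and a session rate of 1.0 blend to 0.3 · 1.0 + 0.7 · 0.5 = 0.65, so a single strong session moves the estimate without overwriting history.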

A8. Skill-Scoped Attribution

Every tool call is attributed to the currently-active skill (or manual if none is registered). Skills register a scope at entry, unregister at exit; the stack supports nesting so a parent skill that invokes a child skill still has correct parent/child lineage on every event.

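$$\text{attr}(c) = \begin{cases} s_{\text{top}} & \text{if } S \neq \emptyset \text{ and } s_{\text{top}} \text{ is alive and within TTL} \\ \texttt{manual} & \text{otherwise} \end{cases}$$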

Where $S$ is the stack of active skills (LIFO), $s_{\text{top}}$ is the most recent, and $\text{TTL} = 3600\text{s}$ (configurable via EMU_SKILL_TTL). Scopes are keyed by 16-hex-char invocation ids — not PIDs — so entries survive PID reuse (systemd InvocationID pattern). Eviction on every read: stale entries (dead PID or expired TTL) are purged before the "current" scope is returned.

Emitted as skill-metrics.jsonl alongside metrics.jsonl. /emu:analytics surfaces the per-skill breakdown.
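
A simplified Python sketch of the scope stack is shown here. The class and method names are hypothetical, and the sketch covers only TTL-based eviction; the dead-PID check described above is omitted. Generating ids with `secrets.token_hex(8)` is one way to obtain 16 hex characters and is an assumption about the id source.

```python
import secrets
import time

SKILL_TTL = 3600  # seconds, from the text (EMU_SKILL_TTL)


class SkillScopeStack:
    """Hypothetical LIFO stack of active skills, keyed by invocation id."""

    def __init__(self, ttl=SKILL_TTL, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self._stack = []  # oldest first; each entry is a dict

    def register(self, skill: str) -> str:
        """Push a scope and return its 16-hex-char invocation id."""
        invocation_id = secrets.token_hex(8)
        parent = self._stack[-1]["id"] if self._stack else None
        self._stack.append({"id": invocation_id, "skill": skill,
                            "parent": parent, "started": self.clock()})
        return invocation_id

    def unregister(self, invocation_id: str) -> None:
        """Remove a scope by id, wherever it sits in the stack."""
        self._stack = [s for s in self._stack if s["id"] != invocation_id]

    def current(self) -> str:
        """Evict expired scopes, then return the top skill or 'manual'."""
        now = self.clock()
        self._stack = [s for s in self._stack if now - s["started"] < self.ttl]
        return self._stack[-1]["skill"] if self._stack else "manual"
```

Because each entry records its parent's id at registration time, a child skill's events can still be traced to the parent that invoked it.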

A9. Worktree Session Graph

Concurrent Claude Code sessions across multiple git worktrees of the same repo are unified into one view by the root-commit hash:

repo_id = first 12 chars of sha256(c_0); c_0 = git rev-list --max-parents=0 HEAD

The root commit is stable across clones, forks, renames, and worktree paths; basename-of-toplevel is not. Cross-worktree events land in $XDG_STATE_HOME/emu/<repo_id>/, sharded per-PID (skill-metrics-global.<pid>.jsonl) to avoid concurrent-append interleaving on filesystems without atomicity guarantees (Windows, NFS). Readers glob all shards and merge by ts:

unified_session = union of shards across all worktrees of the repo_id

/emu:report renders a WORKTREE OVERVIEW section when ≥ 2 worktrees have written. /emu:report --global forces the unified view across every session recorded in the global dir.

Learnings (A7) also migrate to $XDG_DATA_HOME/emu/<repo_id>/learnings.json — the data dir per XDG spec — so cross-session accumulation survives cache wipes and spans every worktree without symlinks.
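
The two A9 building blocks, deriving the repo id and merging shards, can be sketched in Python as follows. The function names are hypothetical, and hashing the stripped hex string of the root commit (as printed by `git rev-list --max-parents=0 HEAD`) is an assumption about the exact bytes that are hashed. Only the `ts` field is taken from the text; other event fields are illustrative.

```python
import hashlib
import json


def repo_id(root_commit: str) -> str:
    """First 12 hex chars of sha256 over the root commit hash string."""
    digest = hashlib.sha256(root_commit.strip().encode("utf-8")).hexdigest()
    return digest[:12]


def merge_shards(shards):
    """Merge per-PID JSONL shards into one event list ordered by 'ts'.

    shards: an iterable of iterables of JSONL lines (for example, open files).
    """
    events = [json.loads(line)
              for shard in shards
              for line in shard
              if line.strip()]
    return sorted(events, key=lambda event: event["ts"])
```

In practice the shard list would come from globbing `skill-metrics-global.*.jsonl` under the repo's state directory; the sketch takes the lines as input so it does not depend on the filesystem.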

Commands

| Command | Plugin | What |
|---|---|---|
| /emu:report | context-guard | Full session dashboard. --global for unified cross-worktree view (A9). |
| /emu:runway | context-guard | Quick turns-until-compaction check |
| /emu:analytics | context-guard | Per-tool + per-skill token breakdown (A8) |
| /emu:doctor | context-guard | Diagnostic self-check for all plugins |
| /emu:checkpoint [text] | state-keeper | Save context that survives compaction |
| /emu:checkpoint-show | state-keeper | Display most recent automatic checkpoint |

Compression Rules (15)

| Pattern | Action |
|---|---|
| npm/yarn/pnpm test, vitest, jest | tail -n 40 |
| pytest, python -m unittest | filter pass/fail summary |
| go test | filter PASS/FAIL lines |
| mvn/gradle test | filter BUILD + test summary |
| dotnet build/test | filter pass/fail summary |
| npm/yarn/pnpm install | filter errors/warnings |
| cargo build/test | filter errors/warnings |
| make | filter errors or "Build succeeded" |
| docker build | filter layer summaries + image ID |
| terraform plan | filter Plan summary |
| eslint | filter error count + first errors |
| tsc | filter TS errors |
| git log (verbose) | --oneline -20 |
| find (no head) | head -n 30 |
| cat (>100 lines) | head -n 80 + line count |

Bypass: prefix with FULL: to skip compression.
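
To make the rule-matching idea concrete, here is a Python sketch covering two of the rules above plus the bypass. It is illustrative only: the regular expressions, the `rewrite` function, and the behavior of stripping the `FULL:` prefix and running the command unmodified are assumptions about how such a hook could work, not a description of Emu's matcher.

```python
import re

# (pattern, pipe suffix) pairs for two rules from the table; regexes are assumed.
RULES = [
    (re.compile(r"^(npm|yarn|pnpm) test\b|^(vitest|jest)\b"), " | tail -n 40"),
    # Only match find when its output is not already piped into head.
    (re.compile(r"^find\b(?!.*\|\s*head\b)"), " | head -n 30"),
]
BYPASS_PREFIX = "FULL:"


def rewrite(command: str) -> str:
    """Return the command with a compression suffix, or unchanged on bypass."""
    stripped = command.strip()
    if stripped.startswith(BYPASS_PREFIX):
        return stripped[len(BYPASS_PREFIX):].strip()
    for pattern, suffix in RULES:
        if pattern.search(stripped):
            return stripped + suffix
    return stripped
```

Under these assumptions `rewrite("npm test")` yields `npm test | tail -n 40`, while `rewrite("FULL: npm test")` yields plain `npm test`.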

vs Everything Else

Emu Caveman Cozempic context-mode token-optimizer
Drift detection real-time, 3 patterns
Turn forecast Runway + 95% CI threshold only
Output reduction 4 modes 65% prose cut
Input compression 15 rules 18 strategies
Delta mode diff on re-read delta mode
Per-skill + per-tool analytics /emu:analytics (A8) per-tool only waste dashboard
Cross-worktree unified view /emu:report --global (A9)
Tool result aging age-based alerts 3-tier stubbing
Savings proof /emu:report session report ctx_stats quality score
Compaction survival checkpoint.md team state SQLite checkpoints
Self-learning learnings.json
Agents 4 (Haiku)
Dependencies bash + jq Python Node.js + MCP Node.js

Combined: 30-45% token reduction. Not 70%. Honest numbers. Plus the only tool that catches Claude going in circles — and learns from it.

Agent Conduct (11 Modules)

Every skill inherits a reusable behavioral contract from shared/ — loaded once into CLAUDE.md, applied across all plugins. This is how Claude acts inside Emu: deterministic, surgical, verifiable. Not a suggestion; a contract.

| Module | What it governs |
|---|---|
| discipline.md | Coding conduct: think-first, simplicity, surgical edits, goal-driven loops |
| context.md | Attention-budget hygiene, U-curve placement, checkpoint protocol |
| verification.md | Independent checks, baseline snapshots, dry-run for destructive ops |
| delegation.md | Subagent contracts, tool whitelisting, parallel vs. serial rules |
| failure-modes.md | 14-code taxonomy for accumulated-learning logs |
| tool-use.md | Tool-choice hygiene, error payload contract, parallel-dispatch rules |
| skill-authoring.md | SKILL.md frontmatter discipline, discovery test |
| hooks.md | Advisory-only hooks, injection over denial, fail-open |
| precedent.md | Log self-observed failures to state/precedent-log.md; consult before risky steps |
| tier-sizing.md | Prompt verbosity scales inversely with model tier; Haiku needs mechanical steps, Opus runs on intent |
| web-fetch.md | External URL handling: cache, dedup, budget; WebFetch is Haiku-tier-only |

Architecture

Full interactive architecture explorer with 4 tabbed diagrams and plugin component cards:

docs/architecture/ — auto-generated from the codebase. Run python docs/architecture/generate.py to regenerate.

Acknowledgments

Emu builds on substrate laid by others:

Versioning & release cadence

Emu follows Semantic Versioning. Breaking changes land on major bumps only; the CHANGELOG flags them explicitly. Release cadence is opportunistic — tags land when accumulated fixes or features justify a cut, not on a fixed schedule. Migration notes between majors live in docs/upgrading.md.

Contributing

See CONTRIBUTING.md

Citation

If you use this project in research or derivative work, please cite it:

@software{emu_2026,
  title = {Emu},
  author = {{Klaiderman}},
  year = {2026},
  url = {https://github.com/enchanter-ai/emu}
}

See CITATION.cff for additional formats (APA, MLA, EndNote).

License

MIT


Role in the ecosystem

Emu is the session-health layer — it watches the token economy of every Claude Code session. Upstream, Wixie's prompts arrive through the conversation and Emu measures them; tool-call output flows through the same observation path. Downstream, Pech reads Emu's per-turn token accounting and attributes it across plugin × sub-plugin × agent tier × model for forecast and budget purposes.

Emu does not engineer prompts (Wixie's lane), score change trust (Crow's lane), review code correctness (Lich's lane), enforce budget gates via kill-switches (Pech uses cooperative degradation, not pre-emption), or scan security surfaces (Hydra's lane). It observes token burn and keeps the session recoverable across compaction.

See ../wixie/docs/ecosystem.md § Data Flow Between Plugins for the full map.

About

Stop burning API tokens. 9 algorithmic engines for real-time context optimization, infinite-loop detection, and smart prompt compression.
