Djinn — Agent Contract

Audience: Claude. Djinn pins the original session intent, watches for long-horizon drift across /compact, and reasserts the goal when the agent diverges. Answers the question: "Am I still working on what you asked?"

Shared behavioral modules

These apply to every skill in every plugin. Load once; do not re-derive.

@../vis/packages/core/conduct/discipline.md — coding conduct: think-first, simplicity, surgical edits, goal-driven loops
@../vis/packages/core/conduct/capability-fidelity.md — contracts survive capability gaps: recover, escalate, or abort; never silently substitute
@../vis/packages/core/conduct/doubt-engine.md — adversarial self-check before agreement; the active counter to F01 sycophancy
@../vis/packages/core/conduct/context.md — attention-budget hygiene, U-curve placement, checkpoint protocol
@../vis/packages/core/conduct/verification.md — independent checks, baseline snapshots, dry-run for destructive ops
@../vis/packages/core/conduct/verdict-calibration.md — every verdict (DEPLOY/PASS/COMPLETE/VERIFIED) carries n, sampling method, and a calibration qualifier; vis-side abstraction over the wixie DEPLOY bar
@../vis/packages/core/conduct/delegation.md — subagent contracts, tool whitelisting, parallel vs. serial rules
@../vis/packages/core/conduct/failure-modes.md — 14-code taxonomy for accumulated-learning logs
@../vis/packages/core/conduct/tool-use.md — tool-choice hygiene, error payload contract, parallel-dispatch rules
@../vis/packages/skills/conduct/formatting.md — per-target format (XML/Markdown/minimal/few-shot), prefill + stop sequences
@../vis/packages/skills/conduct/skill-authoring.md — SKILL.md frontmatter discipline, discovery test
@../vis/packages/core/conduct/hooks.md — advisory-only hooks, injection over denial, fail-open
@../vis/packages/core/conduct/metacognition.md — periodic goal-restate; fires every K=8 tool-uses or on user meta-question
@../vis/packages/core/conduct/precedent.md — log self-observed failures to state/precedent-log.md; consult before risky steps
@../vis/packages/core/conduct/precedent-freshness.md — verify self-authored memory/precedent/briefings before relying on them: Class-A surfaces (path/function/flag) get a Glob/Grep existence check; Class-B snapshots get a git-log freshness check; Class-C feedback rules are trusted unless contradicted
@../vis/packages/core/conduct/prior-art-discovery.md — F28 counter: run the 5-target discovery pass (shared/scripts, packages/*/skills, state/proposals, slug-glob, signature-grep) before authoring a new tool/script/skill/module
@../vis/packages/core/conduct/reversibility-foresight.md — classify action reversibility (trivial/costly/impossible) before acting; confirmation scales with tier
@../vis/packages/core/conduct/substrate-consumption.md — read-side complement to precedent.md: consume briefing, MEMORY, learnings, and precedent before acting; counter to F24 substrate-blindness
@../vis/packages/core/conduct/sunk-cost-iteration.md — stop-and-re-ask after 2 INCONCLUSIVE/BLOCKED results on the same artifact; iteration is not an authorization to keep patching
@../vis/packages/core/conduct/tier-sizing.md — agent-tier budget allocation per task class
@../vis/packages/web/conduct/web-fetch.md — external-URL-handling hygiene

When a module conflicts with a plugin-local instruction, the plugin wins — but log the override.

Lifecycle

Djinn is hook-driven with two skill-invoked sub-plugins. Four hook events drive five bindings (PreCompact fires both compact-guard and drift-learning) and publish four events on the djinn.* namespace. Intent lives in out-of-context state (plugins/intent-anchor/state/anchor.json), not in repeated in-context reminders; that is how Djinn survives the recall-valley failure that plagues LangChain-style memory buffers.

Event / Skill	Sub-plugin	Role
SessionStart	`intent-anchor`	Capture first-turn intent; write `anchor.json`; publish `djinn.intent.captured`
UserPromptSubmit	`intent-anchor`	Refresh anchor when new constraints appear (LCS ratio < 0.5); publish `djinn.intent.refreshed`
PostToolUse	`drift-aligner`	Per-turn D1+D2+D3 alignment; stderr advisory + `djinn.drift.detected` when preservation < 0.7 and N ≥ 5
PreCompact	`compact-guard`	Inject anchor as structural hint; publish `djinn.compact.intent-hint.injected`
PreCompact	`drift-learning`	Fold session statistics into (intent-type × developer) posterior via D5 Gauss Accumulation
`/rank`	`utterance-rank`	On-demand D4 PageRank over the session utterance DAG
`/reorient`	`intent-reorient`	Manual re-pin of the session anchor

Matchers live in ./plugins/<name>/hooks/hooks.json; agents live in ./plugins/<name>/agents/.
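
The refresh rule in the UserPromptSubmit row can be sketched as below. This is a minimal illustration, not the shipped hook. Assumptions: the ratio is LCS length divided by the anchor's token count, normalization is lowercasing plus alphanumeric tokenization, and the comparison is anchor versus incoming prompt. A plain dynamic-programming LCS stands in for Hunt-Szymanski (same answer, different running time). All function names are hypothetical.

```python
import re
from typing import List

REFRESH_THRESHOLD = 0.5  # from the lifecycle table: refresh when LCS ratio < 0.5


def normalize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens (assumed normalization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a: List[str], b: List[str]) -> int:
    """Row-by-row DP for LCS length; a stand-in for Hunt-Szymanski."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_ratio(anchor: str, prompt: str) -> float:
    """LCS length over anchor token count (assumed definition of the ratio)."""
    anchor_tokens = normalize(anchor)
    if not anchor_tokens:
        return 0.0
    return lcs_length(anchor_tokens, normalize(prompt)) / len(anchor_tokens)


def should_refresh(anchor: str, prompt: str) -> bool:
    """True when the prompt overlaps the anchor too little to be the same ask."""
    return lcs_ratio(anchor, prompt) < REFRESH_THRESHOLD
```

Under these assumptions, a prompt that restates most of the anchor leaves it untouched, while an unrelated prompt triggers a refresh.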

Algorithms

D1 Hunt-Szymanski LCS · D2 Baum-Welch HMM · D3 Vitter Reservoir · D4 Brin-Page PageRank · D5 Gauss Accumulation. Derivations in docs/science/README.md. Defining engine: D1.

ID	Name	Plugin	Algorithm	Reference
D1	Hunt-Szymanski LCS Alignment	intent-anchor + drift-aligner + compact-guard	LCS ratio over normalized tokens	Hunt and Szymanski (1977)
D2	Baum-Welch HMM Task-Boundary Inference	drift-aligner	3-state HMM (ON_TASK / SIDEQUEST / LOST), forward-backward + single-pass re-estimation	Baum and Welch (1970)
D3	Vitter Reservoir Sampling	intent-anchor + drift-aligner	Algorithm R, k=32	Vitter (1985)
D4	PageRank Utterance-DAG Ranking	utterance-rank	Sparse power-iteration over file-touch DAG	Brin and Page (1998)
D5	Gauss Accumulation — Intent-Type Drift Signature	drift-learning	EMA with 30-day half-life + sample-count tracking	Gauss (1809)
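
The D5 row (EMA with a 30-day half-life plus sample-count tracking) could look roughly like this. It is a sketch under stated assumptions: observations are weighted by 0.5 ** (age_in_days / 30), time is passed in as a plain day number, and the `Posterior` fields and `fold` name are hypothetical rather than the schema of posteriors.json.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # from the D5 row


@dataclass
class Posterior:
    weighted_sum: float = 0.0  # decayed sum of observed values
    weight: float = 0.0        # decayed sum of observation weights
    n: int = 0                 # raw sample count, never decayed
    last_day: float = 0.0      # day number of the most recent observation

    @property
    def mean(self) -> float:
        return self.weighted_sum / self.weight if self.weight else 0.0


def fold(p: Posterior, value: float, day: float) -> Posterior:
    """Decay the existing evidence by elapsed time, then add one observation."""
    elapsed = max(0.0, day - p.last_day) if p.n else 0.0
    decay = 0.5 ** (elapsed / HALF_LIFE_DAYS)
    return Posterior(
        weighted_sum=p.weighted_sum * decay + value,
        weight=p.weight * decay + 1.0,
        n=p.n + 1,
        last_day=day,
    )
```

Keeping `n` separate from the decayed weight lets a consumer report both the smoothed estimate and how many sessions actually contributed to it.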

Behavioral contracts

Markers: [H] hook-enforced (deterministic) · [A] advisory (relies on your adherence).

IMPORTANT — Honest-numbers contract [H] Every advisory Djinn publishes carries (preservation_score, ci_low, ci_high, N). Advisories without all four are rejected by the Haiku validator. A score without a bootstrap band and a sample count is not a measurement — it is a guess. Non-parametric bootstrap (1000 iterations, stdlib random.choices) over the D3 reservoir is the only source of the band.
YOU MUST NOT repeat the anchor in-context [A] Intent lives in state/anchor.json, not in reinjected prompt text. The only structural reinjection point is PreCompact. Mid-context repetition lives in the recall valley (Liu et al. "Lost in the Middle", TACL 2024) and buys zero recall. If you find yourself tempted to echo the anchor into a system message, stop.
YOU MUST NOT ask the agent [A] Djinn does NOT solicit the agent's own opinion on whether it has drifted. A drifted agent self-reports as on-task (Shinn et al., Reflexion 2023). Djinn measures with deterministic compute (D1 LCS + D2 HMM + D4 PageRank + D5 EMA) and reports; the agent's self-report is not in the signal path.
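
The honest-numbers contract above can be sketched as follows. Assumptions, none confirmed by the source: the statistic is the mean of the reservoir scores, the band is a 95% percentile interval, and the helper names are hypothetical. The four field names and the 1000-iteration resampling with `choices` come from the contract itself.

```python
import random
from typing import Dict, List, Optional, Tuple

REQUIRED_FIELDS = ("preservation_score", "ci_low", "ci_high", "N")


def bootstrap_band(
    scores: List[float],
    iterations: int = 1000,
    alpha: float = 0.05,  # assumed: 95% percentile band
    seed: Optional[int] = None,
) -> Tuple[float, float]:
    """Non-parametric bootstrap of the mean via resampling with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(sum(rng.choices(scores, k=n)) / n for _ in range(iterations))
    low_index = int((alpha / 2) * iterations)
    high_index = min(iterations - 1, int((1 - alpha / 2) * iterations))
    return means[low_index], means[high_index]


def build_advisory(scores: List[float], seed: Optional[int] = None) -> Dict[str, float]:
    """Bundle the score with its band and sample count."""
    if not scores:
        raise ValueError("no samples: nothing to report")
    ci_low, ci_high = bootstrap_band(scores, seed=seed)
    return {
        "preservation_score": sum(scores) / len(scores),
        "ci_low": ci_low,
        "ci_high": ci_high,
        "N": len(scores),
    }


def is_well_formed(advisory: Dict[str, float]) -> bool:
    """Shape check: all four fields must be present and non-null."""
    return all(advisory.get(field) is not None for field in REQUIRED_FIELDS)
```

A validator in this style rejects any advisory that carries a bare score, which is the behavior the contract describes.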

State paths

State file	Owner	Purpose
`plugins/intent-anchor/state/anchor.json`	intent-anchor	Session-intent anchor (captured once, refreshed on new constraints)
`plugins/drift-aligner/state/reservoir.json`	drift-aligner	Vitter reservoir of turn-score records (k=32)
`plugins/drift-aligner/state/states.jsonl`	drift-aligner	Append-only HMM observation log
`plugins/drift-learning/state/posteriors.json`	drift-learning	Per-(intent-type × developer) drift-signature posterior
`plugins/drift-learning/state/learnings.jsonl`	drift-learning	Per-session append-only summary log (backtesting source)
`plugins/utterance-rank/state/last-rank.json`	utterance-rank	Most recent /rank output
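
The k=32 reservoir held in reservoir.json follows Algorithm R, which can be sketched as below. The `Reservoir` class, its field names, and the record shape are hypothetical; only the algorithm and the capacity come from the tables above.

```python
import random
from typing import Any, List, Optional

K = 32  # reservoir capacity from the D3 row


class Reservoir:
    """Algorithm R: keep a uniform random sample of k items from a stream."""

    def __init__(self, k: int = K, seed: Optional[int] = None) -> None:
        self.k = k
        self.seen = 0                # total records offered so far
        self.items: List[Any] = []   # current sample, at most k records
        self._rng = random.Random(seed)

    def offer(self, record: Any) -> None:
        self.seen += 1
        if len(self.items) < self.k:
            # The first k records always enter the reservoir.
            self.items.append(record)
            return
        # Pick a slot uniformly from [0, seen); replace only if it lands in [0, k).
        slot = self._rng.randrange(self.seen)
        if slot < self.k:
            self.items[slot] = record
```

Each record ends up in the sample with probability k / seen, so the reservoir stays bounded at 32 entries however long the session runs.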

Agent tiers

Tier	Model	Agent	Used for
Orchestrator	Opus	`orchestrator`	Tipping-judgment: compose D1 + D2 + D4 + D5 into a drift verdict
Executor	Sonnet	`topic-tagger`	Gated semantic topic labeling when D1 < 0.7
Validator	Haiku	`aligner`	Per-turn deterministic shape-check — LCS + reservoir bookkeeping

Respect the tiering. Routing a Haiku validation task to Opus burns budget and breaks the cost contract.

Anti-patterns

Echoing the anchor as a mid-context reminder. Lives in the recall valley; buys nothing. Counter: anchor is out-of-context state, reinjected only at PreCompact.
Summary-based intent preservation. Lossy by construction; detail-specific intent may survive one summary but rarely survives many. Counter: D1 operates on the ORIGINAL anchor tokens, never a summary.
Retrieving "similar" turns. Semantic similarity ≠ intent preservation. Two drifted agents can mutually retrieve each other's drift as "relevant context". Counter: D4 ranks by DEMONSTRATED influence on output (file-touch overlap), not similarity.
Self-critique loops on drift. A drifted agent confidently self-reports as on-task. Counter: Djinn never asks the agent; deterministic compute only.
Inflating advisories without N. A preservation score without sample count is a guess. Counter: honest-numbers contract enforced by the Haiku validator; orchestrator returns insufficient_data when N < 5.
Collapsing Djinn with Emu. Emu owns A1 Markov on tool patterns (token-level drift); Djinn owns D1 LCS on goal-tokens (semantic intent drift). Orthogonal signals — do not merge.
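
The sample-count gate from the honest-numbers anti-pattern can be sketched as a small function. This is a deliberate simplification: the real orchestrator composes D1, D2, D4, and D5, whereas this sketch looks only at the preservation score and N. The thresholds and `insufficient_data` come from this document; the `drift_detected` and `on_task` labels and the function name are hypothetical.

```python
from typing import Dict

MIN_SAMPLES = 5        # below this, the orchestrator returns insufficient_data
DRIFT_THRESHOLD = 0.7  # preservation below this counts as drift


def verdict(advisory: Dict[str, float]) -> str:
    """Refuse to judge on thin evidence; otherwise compare against the threshold."""
    if advisory["N"] < MIN_SAMPLES:
        return "insufficient_data"
    if advisory["preservation_score"] < DRIFT_THRESHOLD:
        return "drift_detected"  # hypothetical label
    return "on_task"             # hypothetical label
```

Checking N first means a low score from three samples never surfaces as a drift claim.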

Brand invariants (survive unchanged into every sibling)

Zero external runtime deps. Hooks: bash + jq only. Scripts: Python 3.8+ stdlib only. No npm/pip/cargo at runtime.
Managed agent tiers. Opus = orchestrator/judgment. Sonnet = executor/loops. Haiku = validator/format.
Named formal algorithm per engine. ID prefix letter + number + Author-Year citation in docstring.
Emu-style marketplace. Each sub-plugin ships .claude-plugin/plugin.json + {agents,commands,hooks,skills,state}/ + README.md.
Dark-themed PDF report. Produced by docs/architecture/generate.py on final release.
Gauss Accumulation learning. Per-session learnings at plugins/drift-learning/state/learnings.jsonl; posteriors at plugins/drift-learning/state/posteriors.json.
enchanted-mcp event bus. Inter-plugin coordination via published/subscribed events namespaced djinn.<event>.
Diagrams from source of truth. docs/architecture/generate.py reads plugin.json + hooks.json + SKILL.md frontmatter → writes mermaid diagrams.

Events this plugin publishes: djinn.intent.captured, djinn.drift.detected, djinn.compact.intent-hint.injected, djinn.intent.refreshed
Events this plugin subscribes to (optional): emu.checkpoint.saved, crow.change.classified
