How the four moving parts of vis compose. Read once; reference when extending.
The repo is a packages-monorepo: each of the four layers below is sliced across packages/{core,safety,orchestration,web,skills,cost,memory}/ so adopters can pull just the area they need. The four-layer split is the conceptual frame; package boundaries are the physical layout.
    ┌───────────────────────────────────────────────────────┐
    │                          vis                          │
    └───────────────────────────────────────────────────────┘
                                │
         ┌──────────────┬───────┴──────┬───────────────┐
         ▼              ▼              ▼               ▼
    ┌─────────┐    ┌─────────┐    ┌──────────┐    ┌─────────┐
    │ conduct │    │ engines │    │ taxonomy │    │ recipes │
    │ (rules) │    │ (math)  │    │ (codes)  │    │ (hosts) │
    └─────────┘    └─────────┘    └──────────┘    └─────────┘
         │              │              │               │
         │ "what to do" │ "what to     │ "what failed" │ "where to
         │              │  compute"    │               │  drop it"
| Layer | Where it lives | Role |
|---|---|---|
| conduct | `packages/{core,skills,orchestration,safety,web,memory,cost}/conduct/` (36 rule modules) | What the agent should do or avoid |
| engines | `packages/orchestration/engines/` (12 algorithm docs) | What the agent should compute |
| taxonomy | `packages/core/taxonomy/` (F01–F14 + `axes.md`), `packages/safety/taxonomy/` (F15–F21) | How the agent names what went wrong |
| recipes | `packages/skills/recipes/` (8 host guides) + `packages/cost/recipes/` (eval-harnesses) | How a project picks up the framework |
| runbooks | `packages/core/runbooks/` (F01–F14), `packages/safety/runbooks/` (F15–F21) | How to triage and recover per failure code |
packages/core/anti-patterns.md and packages/core/glossary.md cross-cut all four.
Engines compute. Conduct governs. Taxonomy classifies. Recipes wire.
A typical call path in production:
- Conduct says: "Before any destructive op, verify."
- Engine computes: "Trust score for this file is 0.18 (Beta-Bernoulli over recent change history)."
- Conduct decides: "Score < 0.2; this is a critical band; require explicit confirmation."
- If the agent skips the confirmation:
  - Taxonomy classifies: "That was F10 destructive-without-confirmation."
  - The failure log gets an entry tagged F10. Future sessions read it before the same kind of op.
No engine makes decisions; no conduct module does math; no taxonomy code executes anything. The separation is the design: each layer is auditable on its own terms.
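As a rough illustration of the call path above, the sketch below keeps the math, the decision, and the classification in separate functions, the way a reference implementation in an adopter project might. The uniform Beta(1, 1) prior, the function names, the change counts, the file path, and the log-entry fields are all assumptions made for illustration; only the 0.2 threshold, the 0.18 score, and the F10 code come from the text.

```python
import json
from dataclasses import dataclass
from typing import Optional

CRITICAL_BAND = 0.2  # threshold quoted in the call path above


@dataclass
class ChangeHistory:
    """Hypothetical summary of recent changes to one file."""
    successes: int  # changes that held up
    failures: int   # changes that were reverted or broke something


def trust_score(history: ChangeHistory, prior_alpha: float = 1.0,
                prior_beta: float = 1.0) -> float:
    """Engine: posterior mean alpha / (alpha + beta) of a Beta-Bernoulli model.

    The Beta(1, 1) prior is an assumption; the engine doc owns the real choice.
    """
    alpha = prior_alpha + history.successes
    beta = prior_beta + history.failures
    return alpha / (alpha + beta)


def requires_confirmation(score: float) -> bool:
    """Conduct: decide what to do with the score; the formula stays in the engine."""
    return score < CRITICAL_BAND


def classify_skip(confirmed: bool, needed: bool) -> Optional[str]:
    """Taxonomy: name the failure, if any. Returns a code and executes nothing."""
    if needed and not confirmed:
        return "F10"  # destructive-without-confirmation
    return None


def failure_log_entry(code: str, path: str, score: float) -> str:
    """Hypothetical JSON-lines entry for the failure log."""
    return json.dumps({"code": code, "path": path, "score": round(score, 2)})


history = ChangeHistory(successes=1, failures=8)  # made-up counts
score = trust_score(history)                      # (1 + 1) / (2 + 9) = 2/11
needed = requires_confirmation(score)
code = classify_skip(confirmed=False, needed=needed)
# "src/example.py" is a placeholder path, not a file in the repo.
entry = failure_log_entry(code, "src/example.py", score) if code else None
```

With these made-up counts the score rounds to 0.18, which falls inside the critical band, so a skipped confirmation is classified as F10.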
- Why not just conduct? Conduct without engines is hand-wavy — "check trust" with no formula is a heuristic. Engines give the rules teeth.
- Why separate taxonomy? Codes are reused across many conduct modules. Promoting them to their own surface lets the catalog grow independently and supports per-code drilldown that doesn't fit in a behavior module.
- Why recipes? The framework is host-agnostic by design, but adoption is host-specific. Recipes make the wiring explicit so adopters don't have to re-derive it.
- Why not include reference implementations or evals? Both are valuable. Both are also language-bound and project-specific. Keeping the core layers data/prose-only makes the framework portable; reference implementations live in adopter projects (or in a sibling repo if demand grows).
These hold at all times. Violations are bugs.
- Conduct never imports engines as decision-makers. A conduct module can reference an engine ("compute trust score X") but does not embed it ("if α/(α+β) < 0.2"). The math stays in the engine doc.
- Engines never reference conduct. Engines describe how to compute, not what to do with the result.
- Taxonomy never references project-specific paths. A failure code is project-agnostic; specific examples may name `Glob` or `Read` (universal tool primitives), but never `wixie/state/...` or `myproject/scripts/...`.
- Recipes never invent rules. Recipes are wiring guides: they show how to load existing modules into a host. New rules belong in conduct.
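Two of these invariants lend themselves to a mechanical check. The sketch below is one possible lint, not part of the framework: it flags engine docs that mention a conduct path and taxonomy docs that mention a project-specific path. The function name, the regular expressions, and the list of forbidden prefixes are assumptions; an adopter would substitute their own list.

```python
import re

# Assumed shape of a conduct reference inside an engine doc.
CONDUCT_REF = re.compile(r"(?:^|/)conduct/[\w.-]+\.md")

# Assumed project-specific prefixes; the two examples come from the invariant text.
PROJECT_PATHS = re.compile(r"\b(?:wixie/state|myproject/scripts)/")


def invariant_violations(doc_path: str, text: str) -> list[str]:
    """Return human-readable violations for one doc, based on its folder."""
    parts = doc_path.split("/")
    problems = []
    if "engines" in parts and CONDUCT_REF.search(text):
        problems.append("engine references conduct")
    if "taxonomy" in parts and PROJECT_PATHS.search(text):
        problems.append("taxonomy references a project-specific path")
    return problems
```

A check like this only catches textual references; whether a conduct module embeds engine math still needs a human reviewer.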
Cross-refs across folders use relative paths:
- Inside the same folder: `./X.md`
- Within the same package: `../engines/X.md`, `../conduct/X.md`
- Across packages: `../../<other-package>/conduct/X.md` (e.g., `../../web/conduct/web-fetch.md`)
- Never absolute paths, never repo-rooted (`/packages/core/conduct/X.md`)
This keeps the repo movable. If a fork relocates the packages, a single search-and-replace fixes every reference.
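The relative-path convention can be derived mechanically from two repo-relative paths. A minimal sketch using Python's standard `posixpath` module; the function names and the `A.md` source file are hypothetical, and `X.md` is the placeholder used in the list above.

```python
import posixpath


def cross_ref(from_doc: str, to_doc: str) -> str:
    """Return the relative link to write in from_doc so it points at to_doc.

    Both arguments are repo-relative POSIX paths such as
    'packages/core/conduct/X.md'.
    """
    rel = posixpath.relpath(to_doc, start=posixpath.dirname(from_doc))
    # Same-folder links get an explicit './' to match the convention.
    return rel if rel.startswith("..") else "./" + rel


def is_allowed_ref(link: str) -> bool:
    """Reject absolute and repo-rooted links; both start with '/'."""
    return not link.startswith("/")


same_folder = cross_ref("packages/core/conduct/A.md", "packages/core/conduct/X.md")
same_package = cross_ref("packages/orchestration/conduct/A.md",
                         "packages/orchestration/engines/X.md")
across = cross_ref("packages/core/conduct/A.md", "packages/web/conduct/web-fetch.md")
```

Here `same_folder` is `./X.md`, `same_package` is `../engines/X.md`, and `across` is `../../web/conduct/web-fetch.md`, matching the three cases in the list.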
| Adding a new… | Goes in | PR template |
|---|---|---|
| Behavior rule | `packages/<area>/conduct/<name>.md` (pick the package whose scope the rule belongs to) | Justify why it doesn't fit an existing module |
| Algorithm | `packages/orchestration/engines/<name>.md` | Reference paper + complexity + failure modes |
| Failure code | `packages/core/taxonomy/f<NN>-<slug>.md` (F01–F14) or `packages/safety/taxonomy/f<NN>-<slug>.md` (F15+) + index update | 3+ independent observations + testable counter |
| Host integration | `packages/skills/recipes/<host>.md` | Concrete adoption steps + verification check |
| Architectural decision | `docs/adr/<NNNN>-<slug>.md` | Context, decision, consequences |
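The failure-code row reduces to a small lookup on the code number. A sketch, assuming two-digit zero padding for `<NN>` (suggested by names like F03) and a hypothetical function name:

```python
def taxonomy_path(number: int, slug: str) -> str:
    """Map a failure code to its doc path: F01-F14 live in core, F15+ in safety."""
    if number < 1:
        raise ValueError("failure codes start at F01")
    package = "core" if number <= 14 else "safety"
    # Assumption: <NN> is the code number zero-padded to two digits.
    return f"packages/{package}/taxonomy/f{number:02d}-{slug}.md"
```

For example, `taxonomy_path(3, "context-decay")` yields `packages/core/taxonomy/f03-context-decay.md`. The index update and the PR justification remain manual steps.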
Adjacent docs that describe the framework but aren't part of it (this file, packages/core/anti-patterns.md, packages/core/glossary.md) live at the repo root or in docs/.
- Conduct module names are stable. Renaming `discipline.md` would break every adopter's `CLAUDE.md`.
- Taxonomy F-codes are append-only. F03 is always context-decay. F22+ is for new patterns; existing codes are not renumbered.
- Engine names can be renamed if the algorithm name is wrong. Document the rename in an ADR.
- Recipe paths are loose; recipes are the most volatile surface and may be split or merged as host platforms evolve.
- Not a runtime. Nothing here executes. The framework provides text; the runtime is the host.
- Not a complete agent system. Adopters bring their own orchestration, tool implementations, and evals.
- Not a substitute for evals. Defaults shift behavior on average; per-task evals own task quality.
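The append-only rule for F-codes stated earlier can be checked against two snapshots of the catalog. A sketch, assuming the catalog is available as a mapping from code to slug; that representation and the function name are hypothetical, since the framework itself ships only text.

```python
def append_only_violations(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List codes that were removed or given a different slug between snapshots."""
    problems = []
    for code, slug in old.items():
        if code not in new:
            problems.append(f"{code} removed")
        elif new[code] != slug:
            problems.append(f"{code} changed from {slug} to {new[code]}")
    return problems


# Illustrative snapshots; only F03 and F10 are named in this document.
old_catalog = {"F03": "context-decay", "F10": "destructive-without-confirmation"}
grown = {**old_catalog, "F22": "new-pattern"}  # hypothetical new code
renamed = {"F03": "context-rot", "F10": "destructive-without-confirmation"}
```

Adding a code leaves the result empty, while renaming or dropping an existing code is reported.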