Architecture

How the four moving parts of vis compose. Read once; reference when extending.

The repo is a packages-monorepo: each of the four layers below is sliced across packages/{core,safety,orchestration,web,skills,cost,memory}/ so adopters can pull just the area they need. The four-layer split is the conceptual frame; package boundaries are the physical layout.

The four parts

┌──────────────────────────────────────────────────────────────────┐
│                               vis                                │
└──────────────────────────────────────────────────────────────────┘
                                │
        ┌──────────────────┬────┴────┬──────────────────┐
        ▼                  ▼         ▼                  ▼
   ┌─────────┐        ┌─────────┐  ┌──────────┐    ┌─────────┐
   │ conduct │        │ engines │  │ taxonomy │    │ recipes │
   │ (rules) │        │ (math)  │  │ (codes)  │    │ (hosts) │
   └─────────┘        └─────────┘  └──────────┘    └─────────┘
        │                  │            │                │
        │ "what to do"     │ "what      │ "what failed"  │ "where to
        │                  │  to        │                │  drop it"
        │                  │  compute"  │                │

Layer	Where it lives	Role
conduct	`packages/{core,skills,orchestration,safety,web,memory,cost}/conduct/` (36 rule modules)	What the agent should do or avoid
engines	`packages/orchestration/engines/` (12 algorithm docs)	What the agent should compute
taxonomy	`packages/core/taxonomy/` (F01–F14 + axes.md), `packages/safety/taxonomy/` (F15–F21)	How the agent names what went wrong
recipes	`packages/skills/recipes/` (8 host guides) + `packages/cost/recipes/` (eval-harnesses)	How a project picks up the framework
runbooks	`packages/core/runbooks/` (F01–F14), `packages/safety/runbooks/` (F15–F21)	How to triage and recover per failure code

packages/core/anti-patterns.md and packages/core/glossary.md cross-cut all four.

Composition: who calls whom

Engines compute. Conduct governs. Taxonomy classifies. Recipes wire.

A typical call path in production:

1. Conduct says: "Before any destructive op, verify."
2. Engine computes: "Trust score for this file is 0.18 (Beta-Bernoulli over recent change history)."
3. Conduct decides: "Score < 0.2; this is a critical band; require explicit confirmation."
4. If the agent skips the confirmation:
   - Taxonomy classifies: "That was F10 destructive-without-confirmation."
   - The failure log gets an entry tagged F10. Future sessions read it before the same kind of op.
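
The call path above can be sketched in a few lines of adopter-side Python. This is an illustration only, not part of the framework (which ships text, not code). The `ChangeHistory` type, the function names, the uniform Beta(1, 1) prior, and the 8/40 counts are all assumptions chosen so the score comes out at the 0.18 quoted above; the actual engine doc may define the inputs differently.

```python
from dataclasses import dataclass
from typing import Optional

CRITICAL_BAND = 0.2  # threshold quoted in the call path; owned by conduct


@dataclass
class ChangeHistory:
    """Hypothetical record of recent outcomes for one file."""
    successes: int  # changes that held up
    failures: int   # changes that were reverted or broke something


def trust_score(history: ChangeHistory,
                prior_alpha: float = 1.0,
                prior_beta: float = 1.0) -> float:
    """Engine: posterior mean of a Beta-Bernoulli model, alpha / (alpha + beta)."""
    alpha = prior_alpha + history.successes
    beta = prior_beta + history.failures
    return alpha / (alpha + beta)


def requires_confirmation(score: float) -> bool:
    """Conduct: decides what to do with a score it did not compute."""
    return score < CRITICAL_BAND


def classify(confirmation_required: bool, confirmed: bool) -> Optional[str]:
    """Taxonomy: names the failure; executes nothing."""
    if confirmation_required and not confirmed:
        return "F10"  # destructive-without-confirmation
    return None


# Assumed counts: 8 good changes, 40 bad ones, uniform prior -> 9 / 50.
history = ChangeHistory(successes=8, failures=40)
score = trust_score(history)
needed = requires_confirmation(score)

failure_log = []
code = classify(needed, confirmed=False)  # the agent skipped the confirmation
if code is not None:
    failure_log.append({"code": code, "score": score})
```

Each function maps to one layer, so the threshold can change without touching the formula, and the code label can change without touching either.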

No engine makes decisions, no conduct module does math, and no taxonomy code executes anything. The separation is the design: each layer is auditable on its own terms.

Why these four (and not three or five)

Why not just conduct? Conduct without engines is hand-wavy — "check trust" with no formula is a heuristic. Engines give the rules teeth.
Why separate taxonomy? Codes are reused across many conduct modules. Promoting them to their own surface lets the catalog grow independently and supports per-code drilldown that doesn't fit in a behavior module.
Why recipes? The framework is host-agnostic by design, but adoption is host-specific. Recipes make the wiring explicit so adopters don't have to re-derive it.
Why not include reference implementations or evals? Both are valuable. Both are also language-bound and project-specific. Keeping the core layers data/prose-only makes the framework portable; reference implementations live in adopter projects (or in a sibling repo if demand grows).

Layering invariants

These hold at all times. Violations are bugs.

Conduct never embeds engine math. A conduct module can reference an engine ("compute trust score X") but does not inline its formula ("if α/(α+β) < 0.2"). The math stays in the engine doc.
Engines never reference conduct. Engines describe how to compute, not what to do with the result.
Taxonomy never references project-specific paths. A failure code is project-agnostic; specific examples may name Glob or Read (universal tool primitives), but never wixie/state/... or myproject/scripts/....
Recipes never invent rules. Recipes are wiring guides — they show how to load existing modules into a host. New rules belong in conduct.
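
One of these invariants, "engines never reference conduct", lends itself to a mechanical check. The sketch below is a hypothetical lint, not something the repo is known to ship. It assumes cross-references appear in engine docs as relative paths ending in `.md`, and the file names in the usage lines are made up for illustration.

```python
import re
from typing import List

# Matches relative links that climb out of the current folder and land in a
# conduct/ directory, e.g. ../conduct/X.md or ../../web/conduct/X.md.
CONDUCT_REF = re.compile(r"(?:\.\./)+(?:[\w-]+/)*conduct/[\w.-]+\.md")


def conduct_refs_in_engine(engine_doc_text: str) -> List[str]:
    """Return every conduct link found in an engine doc; empty means clean."""
    return CONDUCT_REF.findall(engine_doc_text)


# Hypothetical engine-doc snippets.
clean = "Posterior mean is alpha / (alpha + beta). See ./beta-bernoulli.md."
dirty = "Feed the result to ../conduct/verification.md before acting."

print(conduct_refs_in_engine(clean))
print(conduct_refs_in_engine(dirty))
```

A non-empty result is a layering bug under the rules above. The other three invariants need human judgment and are harder to reduce to a pattern.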

Cross-references

Cross-refs across folders use relative paths:

Inside the same folder: ./X.md
Within the same package: ../engines/X.md, ../conduct/X.md
Across packages: ../../<other-package>/conduct/X.md (e.g., ../../web/conduct/web-fetch.md)
Never absolute paths, never repo-rooted (/packages/core/conduct/X.md)

This keeps the repo movable. If a fork relocates the packages, a single search-and-replace fixes the references.
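
The path rules above can be reproduced with Python's standard `posixpath` module. This is a minimal sketch for adopters who generate or lint links; `cross_ref` and `is_allowed_ref` are hypothetical helper names, and the document paths in the usage lines are illustrative.

```python
import posixpath


def cross_ref(from_doc: str, to_doc: str) -> str:
    """Build the relative link from one doc to another.

    Both arguments are paths relative to the repo root (no leading slash).
    """
    rel = posixpath.relpath(to_doc, start=posixpath.dirname(from_doc))
    # Same-folder (and deeper) links get an explicit ./ prefix.
    return rel if rel.startswith("../") else "./" + rel


def is_allowed_ref(ref: str) -> bool:
    """Accept only ./ and ../ links; reject absolute and repo-rooted paths."""
    return ref.startswith("./") or ref.startswith("../")


# Across packages
print(cross_ref("packages/core/conduct/a.md", "packages/web/conduct/web-fetch.md"))
# Within the same package
print(cross_ref("packages/orchestration/conduct/a.md", "packages/orchestration/engines/b.md"))
# Inside the same folder
print(cross_ref("packages/core/conduct/a.md", "packages/core/conduct/b.md"))
# Repo-rooted path: rejected
print(is_allowed_ref("/packages/core/conduct/X.md"))
```

The three `cross_ref` calls yield `../../web/conduct/web-fetch.md`, `../engines/b.md`, and `./b.md`, matching the three cases listed above.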

Extending

Adding a new…	Goes in	PR template
Behavior rule	`packages/<area>/conduct/<name>.md` (pick the package whose scope the rule belongs to)	Justify why it doesn't fit an existing module
Algorithm	`packages/orchestration/engines/<name>.md`	Reference paper + complexity + failure modes
Failure code	`packages/core/taxonomy/f<NN>-<slug>.md` (F01–F14) or `packages/safety/taxonomy/f<NN>-<slug>.md` (F15+) + index update	3+ independent observations + testable counter
Host integration	`packages/skills/recipes/<host>.md`	Concrete adoption steps + verification check
Architectural decision	`docs/adr/<NNNN>-<slug>.md`	Context, decision, consequences
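
The failure-code row packs two rules into one cell: the number picks the package, and the file name is `f<NN>-<slug>.md`. A small sketch of that routing follows; `taxonomy_path` is a hypothetical helper, and zero-padding to two digits is an assumption read off the `F01` style of the existing codes.

```python
def taxonomy_path(code: int, slug: str) -> str:
    """Return the doc path for failure code F<code> under the split above."""
    if code < 1:
        raise ValueError("failure codes start at F01")
    # F01-F14 live in core; F15 and later live in safety.
    package = "core" if code <= 14 else "safety"
    return f"packages/{package}/taxonomy/f{code:02d}-{slug}.md"


print(taxonomy_path(3, "context-decay"))
print(taxonomy_path(10, "destructive-without-confirmation"))
```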

Adjacent docs that describe the framework but aren't part of it (this file, packages/core/anti-patterns.md, packages/core/glossary.md) live at the repo root, in docs/, or at the top level of packages/core/.

Stability commitments

Conduct module names are stable. Renaming discipline.md would break every adopter's CLAUDE.md.
Taxonomy F-codes are append-only. F03 is always context-decay. F22+ is for new patterns; existing codes are not renumbered.
Engines can be renamed if the algorithm name is wrong. Document the rename in an ADR.
Recipe paths are loose; recipes are the most volatile surface and may be split or merged as host platforms evolve.
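
The append-only commitment for F-codes is easy to check in review. The sketch below compares two snapshots of the code index and reports anything other than additions. It assumes the index can be read as a code-to-slug mapping; `append_only_violations` and the `F22` slug are hypothetical.

```python
from typing import Dict, List


def append_only_violations(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List every existing code that was removed or re-slugged."""
    problems = []
    for code, slug in old.items():
        if code not in new:
            problems.append(f"{code} removed")
        elif new[code] != slug:
            problems.append(f"{code} renamed: {slug} -> {new[code]}")
    return problems


old_index = {"F03": "context-decay", "F10": "destructive-without-confirmation"}

# Adding a new code is fine.
ok_index = {**old_index, "F22": "example-new-pattern"}
print(append_only_violations(old_index, ok_index))

# Re-slugging F03 and dropping F10 are both violations.
bad_index = {"F03": "context-rot"}
print(append_only_violations(old_index, bad_index))
```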

What this architecture is not

Not a runtime. Nothing here executes. The framework provides text; the runtime is the host.
Not a complete agent system. Adopters bring their own orchestration, tool implementations, and evals.
Not a substitute for evals. Defaults shift behavior on average; per-task evals own task quality.
