-
Notifications
You must be signed in to change notification settings - Fork 4.4k
14.1 Intentional Architecture
Source: #5574 Ā· Part of the Maturity Framework. This page mirrors the RFC body; discussion lives on the issue.
A note to the team before you read this.
This document was written to help us move from a codebase that grew reactively into one that is built with intention. If some of the concepts here are new to you, that is not a problem ā it means this document is doing its job. Every senior engineer you will ever work with has learned these lessons the hard way, on a codebase that got too big to understand. We have the rare opportunity to recognize the pattern early and course-correct before it becomes painful. This is a good thing. Take your time with it.
- A Development Philosophy: Vision First
- The Vision ā What ZeroClaw Is
- Honest Assessment: Where We Are Today
- The Target Architecture
- Standards We Should Adopt
- Phased Roadmap: v0.7.0 ā v1.0.0
- Code and Complexity Metrics
- What This Means for Contributors
- Discussion Questions for the Team
| Rev | Date | Summary |
|---|---|---|
| 1 | 2026-04-09 | Initial draft |
| 2 | 2026-04-09 | Added §4.4.1 Versioning Policy (unified workspace inheritance, stability tiers, product-level breaking change definition); added §4.4.2 Release Artifacts (feature flag fate, canonical release binary profile, release artifact matrix); added Discussion Questions for versioning strategy and observability defaults |
| 3 | 2026-04-10 | Terminology correction per implementation feedback from PR #5559: "kernel" ā "runtime" for the agent orchestration layer throughout; "kernel" now refers specifically to the irreducible foundation (--no-default-features build); §4.1 updated to describe the explicit two-layer architecture (foundation + runtime); §4.2ā§4.3 dependency diagram and component map updated to show zeroclaw-runtime; Phase 2 renamed from "The Kernel" to "The Runtime"; binary size targets reframed as aspirational north stars with measured progress tracking rather than hard gates; §7 updated with actual Phase 1 measurement (6.6 MB foundation build) and explicit note that architectural decomposition enables optimization but optimization is a dedicated second pass |
Every decision we make in software ā what to build, how to build it, what to skip ā should flow downward from a hierarchy of intent:
Vision
āāā Architecture
āāā Design
āāā Implementation
āāā Testing
āāā Documentation
āāā Release
This is not a waterfall process. It is a decision hierarchy. It means that when you are writing a function, you should be able to trace a straight line upward: this function exists because of this design decision, which exists because of this architectural choice, which exists because of this vision. If you cannot draw that line, the code probably should not exist.
What each layer means in practice:
| Layer | The Question It Answers | What Goes Wrong Without It |
|---|---|---|
| Vision | Why does this project exist? Who is it for? What does success look like? | You build things nobody needs, or contradict yourself across releases |
| Architecture | What are the structural decisions that make the vision possible? | You end up with a "Big Ball of Mud" ā code that works but cannot be changed without breaking something else |
| Design | How do the components relate? What are the interfaces between them? | You get tight coupling ā components that know too much about each other's internals |
| Implementation | How do we build this specific component? | Bugs, performance issues, security holes |
| Testing | Does the implementation match the design? Does the design serve the architecture? | You ship broken things and don't know why |
| Documentation | How do we transfer this knowledge to the next person? | Every contributor has to rediscover everything from scratch |
| Release | How do we get this to users safely and sustainably? | Users get broken or confusing software |
ZeroClaw was bootstrapped by AI tools working from OpenClaw's TypeScript codebase. AI code generation works at the Implementation layer. It writes functions, structs, and modules that do things. It does not set Vision. It does not make Architecture decisions. It does not define Design contracts.
The result is a codebase that is impressively functional but architecturally accidental. The code does what it needs to do today, but it was not designed ā it accumulated. This pattern has a name in our industry: the Big Ball of Mud. It is the most common architecture in software, not because anyone chose it, but because it is what you get when you skip the top of the hierarchy.
This RFC is our chance to fix that ā not by throwing away what works, but by growing an intentional architecture around it using a technique called the Strangler Fig Pattern: we build the new structure around the edges of the old one, migrating inward over time, until the old structure is gone. No "big bang" rewrite. No throwing away working code. Just steady, intentional improvement.
Before we talk about architecture, we need to be precise about what we are building. This is the Vision layer. Everything that follows must serve this.
ZeroClaw is a personal AI assistant runtime that any person can run on any hardware ā from a $10 embedded board to a cloud server ā with zero configuration overhead, zero external service requirements, and zero compromise on capability or security.
Breaking that down into concrete commitments:
Zero overhead. The core agent starts in milliseconds and uses less memory than a browser tab. This is not a marketing claim ā it is an architectural constraint. Every decision we make must be tested against it.
Zero external requirements. A user who downloads ZeroClaw and has an LLM provider configured should have a working, useful AI assistant without installing anything else. Channels, dashboards, and integrations are things you add when you want them ā not things you need before it works.
Zero compromise. Lean does not mean weak. ZeroClaw must have a serious security model, real observability, and genuine extensibility. The tension between "small binary" and "full capability" is resolved through composition: a small core, extended by components you choose.
For every skill level. A student on a $10 Raspberry Pi and a team running a production deployment should both feel like ZeroClaw was designed for them. This means the default experience must be simple, and the advanced experience must be powerful ā not two different products.
User-owned. Your data, your hardware, your configuration. ZeroClaw does not require an account, does not phone home, and does not lock you into a platform.
This section is not criticism of anyone's work. It is a diagnosis, and you cannot fix what you do not name.
The entire ZeroClaw codebase currently lives in a single Rust crate. This means:
- A Telegram channel and the core agent loop are compiled from the same source tree whether you use Telegram or not
- The web dashboard (a full React application) is embedded in the binary using
rust-embed, making every binary include the web UI even for users who only ever use the CLI - The gateway HTTP server contains webhook handlers for WhatsApp, WATI, Linq, Nextcloud Talk, and Gmail ā meaning specific channel integrations are baked into the web server
- Every one of the 70+ tools is compiled into the binary, regardless of which tools a user will ever call
- The only mechanism for excluding code is a Cargo feature flag, which requires users to have a Rust development environment and recompile from source
The consequence for users: The stated goal is a lean binary for $10 hardware. But the binary ships with code for 27 messaging channels, 70+ tools, a full web server, a React application, and integrations with Jira, Notion, Google Workspace, LinkedIn, and more ā most of which any given user will never touch.
The consequence for contributors: When a file is 9,500 lines long, it is not possible to understand it. When every feature is in one crate, touching anything risks breaking everything.
These are measured facts from the current codebase, not estimates:
| File | Lines | What It Does | What It Should Do |
|---|---|---|---|
src/agent/loop_.rs |
~9,500 | Tool call parsing, streaming, history, cost tracking, model routing, memory, credential scrubbing, context building | Orchestrate a single agent turn |
src/gateway/mod.rs |
~2,260 | Web server + React app server + WhatsApp webhooks + WATI webhooks + Linq webhooks + Nextcloud webhooks + Gmail webhooks + pairing + rate limiting + WebAuthn | Serve the web dashboard API |
src/providers/mod.rs |
~3,750 | Factory + 40+ provider implementations + OAuth flows + credential resolution + error scrubbing | Route to a provider |
src/tools/mod.rs |
all_tools_with_runtime() at L387āL1066 |
Instantiate all 70+ tools unconditionally | Register the tools the user configured |
A 9,500-line file is not a module. It is a monolith that happens to have a .rs extension.
This diagnosis should not obscure what is genuinely well-designed:
-
The trait layer is excellent.
Provider,Channel,Tool,Memory,Observer,RuntimeAdapter, andPeripheralare clean, well-documented Rust traits. These are the right seams. The problem is they do not correspond to crate boundaries, so the compiler cannot enforce the layering. -
The WASM plugin system is partially built.
PluginHost,WasmTool,WasmChannel,PluginManifest, and Ed25519 signature verification all exist insrc/plugins/. The execution bridge is a stub, but the structure is correct. -
The observability system is mature. OpenTelemetry, Prometheus, and DORA metrics are all implemented against a clean
Observertrait. This is production-quality work. - The security model is thoughtful. Pairing codes, autonomy levels, sandboxing, and policy enforcement show real design intent.
We are not rewriting ZeroClaw. We are giving its existing good ideas a structure they can grow in.
A microkernel architecture separates a minimal, stable core from optional subsystems that extend it. In operating systems, the classic example is a kernel that only handles memory and scheduling, with everything else ā filesystems, device drivers, network stacks ā running as separate processes that communicate through a well-defined interface.
For an AI agent runtime, the mapping reveals two distinct internal layers that the OS analogy conflates:
| OS Microkernel Concept | ZeroClaw Equivalent |
|---|---|
| Kernel |
Foundation layer ā API traits, config, providers, memory backends, infra, tool-call parser. The irreducible core: builds with --no-default-features. Can exchange messages with an LLM and store memory. Nothing more. |
| Init / runtime system |
Agent runtime layer ā Orchestration loop, security policy enforcement, plugin host, core tools, IPC API. The zeroclaw-runtime crate, gated by the agent-runtime feature. This is what makes ZeroClaw an agent, not just a library. |
| IPC | Local socket / IPC API between the runtime and external components |
| Device drivers | Channel plugins (Telegram, Discord, etc.) |
| Filesystem drivers | Memory backend plugins (SQLite, Markdown) |
| User processes | Gateway binary, Tauri desktop app |
The distinction matters: the foundation is the minimum that must exist for any ZeroClaw binary to function. The runtime is the minimum that must exist for it to function as an agent. Everything else is composed in.
This two-layer split was identified during the Phase 1 workspace decomposition (PR #5559) and is reflected in the crate naming: zeroclaw-runtime (the crate) is gated by agent-runtime (the feature). The earlier revisions of this RFC used "kernel" loosely to refer to what is now correctly named the runtime layer. This revision corrects that terminology throughout.
The most important architectural rule in this design ā the one that, if broken, collapses the whole structure ā is this:
Dependencies flow inward. The runtime knows nothing about the plugins. Plugins know about the API. Nothing knows about everything.
zeroclaw-api ā defines all traits (Provider, Channel, Tool, ...)
ā² no implementations, no heavy dependencies
ā depends on
foundation crates ā zeroclaw-config, zeroclaw-providers, zeroclaw-memory,
ā² zeroclaw-infra, zeroclaw-tool-call-parser
ā depends on all depend on zeroclaw-api; no cross-dependencies
zeroclaw-runtime ā implements the agent loop (agent-runtime feature)
ā² depends on zeroclaw-api + foundation crates
ā depends on knows nothing about specific channels or tools
plugin crates ā zeroclaw-channel-discord, zeroclaw-tools-web, ...
ā² depend on zeroclaw-api (not the runtime)
ā depends on
zeroclaw binary ā thin wiring layer
reads config, registers plugins, starts runtime
If zeroclaw-runtime ever imports TelegramChannel, the architecture has been violated. The compiler will enforce this once crate boundaries are drawn.
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā zeroclaw (binary crate) ā
ā Reads config ā registers only configured components ā starts ā
ā ā
ā āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā ā
ā ā zeroclaw-runtime (agent-runtime feature) ā ā
ā ā ā ā
ā ā Agent Loop Ā· CLI Channel Ā· Security Policy ā ā
ā ā Plugin Host Ā· Local IPC API ā ā
ā ā Core Tools: shell, file, git, memory recall/store ā ā
ā ā ā ā
ā ā āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā ā ā
ā ā ā Foundation (--no-default-features) ā ā ā
ā ā ā ā ā ā
ā ā ā zeroclaw-api Ā· zeroclaw-config Ā· zeroclaw-infra ā ā ā
ā ā ā zeroclaw-providers Ā· zeroclaw-memory ā ā ā
ā ā ā zeroclaw-tool-call-parser ā ā ā
ā ā ā ā ā ā
ā ā ā Vision target: <5 MB RAM at runtime ā ā ā
ā ā āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā ā ā
ā āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā ā
ā ā² ā
ā zeroclaw-api (traits only) ā
ā ā² ā
ā āāāāāāāāāāāāāāāā āāāāāāāāāā“āāāāāāāāā āāāāāāāāāāāāāāāāāāāāāāā ā
ā ā zeroclaw-gw ā ā Channel pluginsā ā Tool plugins ā ā
ā ā (opt-in ā ā ā ā ā ā
ā ā binary) ā ā channel-discordā ā tools-web ā ā
ā ā ā ā channel-slack ā ā tools-integrations ā ā
ā ā HTTP/WS/SSE ā ā channel-tg ā ā tools-hardware ā ā
ā ā Web UI ā ā channel-email ā ā tools-mcp ā ā
ā ā REST API ā ā ... ā ā ... ā ā
ā āāāāāāāā¬āāāāāāāā āāāāāāāāāāāāāāāāāāā āāāāāāāāāāāāāāāāāāāāāāā ā
ā ā ā
ā ā¼ ā
ā āāāāāāāāāāāāāāāāāāā ā
ā ā zeroclaw-desktopā ā Tauri app (already exists in apps/tauri) ā
ā ā System tray app ā bundles zeroclaw-gw as a sidecar ā
ā ā Native GUI ā ā
ā āāāāāāāāāāāāāāāāāāā ā
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
The architecture enables a clean distribution story that requires no Rust toolchain from end users:
| User wants | What they download | What zeroclaw onboard does |
|---|---|---|
| CLI only |
zeroclaw runtime binary |
Configure provider, done |
| CLI + Discord |
zeroclaw runtime binary |
Download + install channel-discord.wasm
|
| Local web UI |
zeroclaw + zeroclaw-gw
|
Configure both, open browser |
| Desktop app |
zeroclaw-desktop installer |
Bundles runtime + gateway + UI |
| Everything |
zeroclaw-desktop or zeroclaw --profile full
|
Downloads all plugins |
The zeroclaw plugin install command (backed by PluginHost, which already exists) becomes the package manager. The zeroclaw onboard wizard integrates it so non-technical users never see cargo.
As ZeroClaw transitions from a single crate to a multi-crate workspace, two concerns must be kept separate from the start:
-
The product version ā what
zeroclaw --versionreports, what GitHub Releases, changelogs, and package managers (Homebrew, apt, cargo-binstall) track. This is the version operators and users reason about. - Component stability ā how mature and reliable a given component is. A single version number cannot carry this signal on its own.
These are orthogonal. Conflating them creates misleading semver noise and erodes trust in the version number. This policy defines both.
Crate versioning: unified with intentional exceptions
All application crates ā the kernel, the gateway, tool plugin crates, channel plugin crates, and the CLI ā use Cargo workspace package inheritance:
# Cargo.toml (workspace root)
[workspace.package]
version = "0.8.0"
# crates/zeroclaw-kernel/Cargo.toml
[package]
version.workspace = trueA single version in the root Cargo.toml is the authoritative product version. This is the right model because:
- Users, operators, and packagers deal with one version, not twelve
- Release automation via
release-plzis straightforward: one PR, one bump, one changelog entry - It reflects ZeroClaw's identity as a product, not a library ecosystem
- The WIT interface version ā not the Rust crate version ā is the actual plugin ABI contract (see §5.2)
Three crate classes are intentionally excluded from workspace inheritance and maintain independent versions on their own cadence:
| Crate | Reason for independence |
|---|---|
zeroclaw-api |
Starts at 0.1.0; its 1.0.0 release is a formal milestone deliverable of v1.0.0, signalling a stable Rust trait surface for plugin SDK authors |
aardvark-sys, zeroclaw-robot-kit
|
Hardware library crates with their own user audiences and maintenance cadences; not application components |
WIT interface files (wit/*.wit) |
Versioned via @since and @unstable annotations per the WASI component model spec; these are the primary plugin ABI contract and are independent of Cargo semver entirely |
What "breaking" means for the product version
Because application crates share a unified version, the team needs a product-level definition of a breaking change ā distinct from a breaking change inside a single crate's internal implementation. A breaking change within a plugin crate that does not cross any of the boundaries below is not a product-level breaking change and does not warrant a MAJOR bump.
| Bump | Warranted when |
|---|---|
| MAJOR | WIT interface changes incompatibly (existing plugins must recompile); kernel IPC API changes incompatibly (gateway or external clients break); config file schema requires a migration; CLI commands or flags are removed or renamed |
| MINOR | New capabilities anywhere in the workspace; new plugins available in the registry; new stable APIs; stability tier promotions; deprecation announcements (not removals) |
| PATCH | Bug fixes; security patches; documentation corrections; no new capabilities and no deprecations |
Stability tiers
The product version answers "what release is this?" A stability tier answers "how much can I rely on this component?" Every component ā kernel, gateway, plugin crate, WIT interface ā carries one of three tiers. Tiers are documented in the component's AGENTS.md and in its plugin registry manifest.
| Tier | Meaning | Implication |
|---|---|---|
| Stable | Covered by the product's breaking-change policy. No breaking changes without a MAJOR version bump and a published migration guide. | Kernel (target: v0.8.0), zeroclaw-api WIT interface (target: v0.9.0), kernel IPC API (target: v1.0.0) |
| Beta | Functional and tested. Breaking changes are permitted in MINOR releases but are announced in the changelog with upgrade notes. |
zeroclaw-gw (v0.9.0 ā v1.0.0), mature channel and tool plugins |
| Experimental | No stability guarantee. May break in PATCH releases. Must be clearly marked as experimental in docs and plugin registry manifests. |
New tool integrations, new channel implementations, early hardware plugins |
Stability tiers are promoted, never demoted through a deliberate team decision. Promotions are recorded in the changelog and, for architectural components, in an ADR. A component must hold its current tier for at least one full release cycle before promotion is considered.
Release automation
Releases use release-plz, which opens a release PR on push to master, bumps the workspace version, and generates a changelog from conventional commit titles. release-plz natively understands workspace inheritance and handles the crate publication order automatically. Crates with independent versions (zeroclaw-api, hardware library crates) are managed separately using the same tool's per-crate configuration.
The microkernel transition changes the fundamental nature of the question "which features are compiled in?" Today that question has one answer: whatever feature flags you passed to cargo build. After the transition it splits into two separate concerns:
- What is in the kernel binary ā fixed at compile time, determined per platform, published to GitHub Releases
-
What capabilities are available ā determined at runtime by which plugins are installed via
zeroclaw plugin install
These are no longer the same question, and the current [features] section of Cargo.toml must be interpreted through that lens.
Fate of the current compile-time feature flags
The 20+ feature flags in the current Cargo.toml fall into three buckets as the architecture matures:
| Bucket | Flags | Outcome |
|---|---|---|
| Retire ā plugin |
channel-nostr, channel-matrix, channel-lark, channel-feishu, whatsapp-web, browser-native
|
Removed from the kernel. Each becomes a WASM plugin crate published to the plugin registry. No compile-time decision required. |
| Always-on |
plugins-wasm, skill-creation
|
Compiled into every kernel binary unconditionally. plugins-wasm is the kernel's core mechanism; skill-creation is a zero-overhead code path. Neither belongs behind a flag. |
| Stay ā platform/infrastructure flag |
peripheral-rpi, hardware, sandbox-landlock, sandbox-bubblewrap, voice-wake, probe
|
Remain as compile-time flags because they require native library linking or OS-level access that cannot be provided by a WASM plugin. peripheral-rpi and hardware appear only in platform-specific release targets. |
Two flags require a deliberate team decision before the v0.8.0 release and are surfaced here rather than resolved unilaterally:
-
observability-prometheusā currently indefault. Prometheus metrics add measurable binary size overhead. The question is whether a production runtime should ship observability on by default, or whether operators opt in. Recommendation: keep indefaultfor the standard release; operators on severely size-constrained targets can build with--no-default-features. -
observability-otelā OTLP export carries a larger dependency footprint (opentelemetry + reqwest blocking client). Recommendation: remains opt-in, not indefault. Production deployments that need trace export enable it explicitly.
The ci-all meta-feature simplifies substantially as channel and tool flags retire. By v1.0.0 it covers only the remaining platform and infrastructure flags.
The canonical release kernel binary
The binary published to GitHub Releases for each platform target is built with the following profile:
| Compiled in | Not compiled in |
|---|---|
| Core agent loop | Any channel implementation |
| 10ā12 core tools (see Phase 2 D2) | Any non-core tool |
| SQLite + Markdown memory backends | Browser automation |
Plugin host (plugins-wasm, always-on) |
observability-otel (operator opt-in) |
observability-prometheus |
voice-wake (libasound2 dependency) |
skill-creation (zero-overhead) |
probe (niche hardware debugging) |
| IPC server | Web assets (moved to zeroclaw-gw) |
| Platform sandbox where supported |
peripheral-rpi (separate hardware build) |
There is no longer a "build with everything" binary. That mental model is replaced by zeroclaw plugin install --profile full, which downloads the full plugin catalog after installing the lean kernel binary.
Release artifact matrix
Each GitHub Release publishes the following artifacts:
| Artifact | Targets | Notes |
|---|---|---|
zeroclaw kernel binary |
x86_64-unknown-linux-musl, aarch64-unknown-linux-gnu, armv7-unknown-linux-gnueabihf, x86_64-apple-darwin, aarch64-apple-darwin, x86_64-pc-windows-msvc
|
Static musl build for Linux x86_64; GNU for ARM targets |
zeroclaw kernel binary (hardware) |
aarch64-unknown-linux-gnu, armv7-unknown-linux-gnueabihf
|
Same targets, compiled with peripheral-rpi and hardware flags for Raspberry Pi deployments |
zeroclaw-gw gateway binary |
Same platform matrix as kernel | Published alongside the kernel; users install separately |
| WASM plugin files | wasm32-wasip1 |
Published to the plugin registry (not GitHub Releases); installable via zeroclaw plugin install
|
zeroclaw-desktop installer |
x86_64 and aarch64 for macOS, Windows, Linux (AppImage/deb) |
Bundles kernel + gateway + full plugin set; built by the Tauri workflow |
The wasm32-wasip1 plugin builds run in a separate CI job and are published to the plugin registry on their own cadence. A plugin release does not require a kernel release.
The current gateway conflates two things that must be separated:
Current (wrong):
zeroclaw binary
āāā gateway
āāā Web UI server (serves React app)
āāā REST/WS/SSE API
āāā WhatsApp webhook handler ā this is a channel, not a web server
āāā WATI webhook handler ā this is a channel, not a web server
āāā Linq webhook handler ā this is a channel, not a web server
āāā Nextcloud webhook handler ā this is a channel, not a web server
āāā Gmail push handler ā this is a channel, not a web server
Target (correct):
zeroclaw-kernel
āāā Local IPC API (Unix socket / 127.x HTTP)
zeroclaw-gw (separate binary, optional)
āāā Connects to kernel IPC API
āāā Web UI server
āāā REST/WS/SSE API
āāā Generic webhook proxy ā routes to channel plugins
channel-whatsapp.wasm
āāā Registers its own webhook route with the gateway
āāā Handles WhatsApp-specific message parsing
Why this matters: When the gateway is a separate process, it can crash, restart, or be absent without affecting the agent. The kernel keeps running. This is especially important for the edge hardware use case ā a Raspberry Pi running the kernel can have its web UI served from a VPS, with the kernel connecting outbound via a channel plugin. No inbound firewall rules needed.
Standards are agreements that have been made by many smart people over many years. Adopting them means we get those years of thinking for free, and it means our software integrates naturally with the rest of the ecosystem. Here are the ones that apply directly to ZeroClaw.
What it is: OpenTelemetry (OTel) is the industry standard for collecting traces, metrics, and logs from software systems. It is maintained by the Cloud Native Computing Foundation and supported by every major cloud provider and monitoring tool.
Why it matters for ZeroClaw: We have already implemented OtelObserver against our Observer trait. We have Prometheus metrics and DORA metrics. The issue is that these are not yet standardized across the codebase ā some modules log with tracing::info!, others emit ObserverEvents, and the two are not connected.
What we should do:
- Adopt OpenTelemetry as the single observability interface for all components
- Ensure every plugin emits OTel spans when it executes, so a user can see a full trace from "message received on Discord" through "agent called shell tool" to "response sent"
- Adopt W3C Trace Context (
traceparent/tracestateheaders) for propagating trace IDs across the kernel ā gateway ā plugin boundary - Structured log output should be JSON when
ZEROCLAW_LOG_FORMAT=jsonis set (already using thetracingcrate, just needs a JSON subscriber)
Standards: OpenTelemetry specification Ā· W3C Trace Context (REC) Ā· RFC 5424 (Syslog, for system log integration)
What it is: WASI (WebAssembly System Interface) is the standard API that WebAssembly modules use to interact with the host system. WIT (WebAssembly Interface Types) is the interface definition language for describing what a WASM component exports and imports ā think of it as a .proto file but for WASM plugins.
Why it matters for ZeroClaw: Our WasmTool and WasmChannel bridges currently have no formal contract for what a plugin WASM binary must export. This means a plugin author has to guess. WIT files define that contract precisely and enable automatic code generation for plugin authors in any language.
What we should do:
- Define WIT interface files for
Tool,Channel, andMemoryplugin types (awit/directory at the root of the workspace) - Use
wit-bindgento generate the Rust host-side bindings from those WIT files - Document the WIT interfaces as the official plugin SDK
- A plugin author writes Rust (or Go, or C, or Python) against the WIT interface and
cargo build --target wasm32-wasiā the result drops into~/.zeroclaw/plugins/
Standards: WASI 0.2 Ā· W3C WebAssembly Component Model Ā· WIT IDL
What it is: OpenAPI is the standard for describing HTTP APIs. Version 3.1 aligns with JSON Schema Draft 2020-12.
Why it matters for ZeroClaw: The kernel's local IPC API (the socket that the gateway and other components connect to) needs a stable, documented contract. Without a formal spec, the gateway and kernel will drift apart silently over time.
What we should do:
- Write an OpenAPI 3.1 spec for the kernel's local IPC API before implementing it
- Generate the Rust server stubs from the spec using
utoipaoraide - Publish the spec as
docs/reference/api/kernel-ipc-api.yaml - The gateway's external API should also have an OpenAPI spec
Standards: OpenAPI 3.1 Ā· JSON Schema Draft 2020-12
What it is: The OWASP Application Security Verification Standard is a checklist of security requirements organized by risk level (L1 basic, L2 standard, L3 advanced).
Why it matters for ZeroClaw: The gateway handles webhooks from external services, processes untrusted user input, and manages secrets. The pairing system, WebAuthn support, and rate limiting all exist ā but there is no framework for verifying that they are complete or correct.
What we should do:
- Target ASVS Level 2 for the gateway and security module
- Work through the Level 2 checklist and document which requirements we meet, which we partially meet, and which are out of scope
- Use this as the basis for security-related issues and PRs
Standards: OWASP ASVS 4.0 Ā· OWASP Top 10
What it is: ISO/IEC 25010 defines a model for software product quality with eight top-level characteristics: functional suitability, performance efficiency, compatibility, usability, reliability, security, maintainability, and portability.
Why it matters for ZeroClaw: When someone asks "is this good enough to merge?" the answer is currently subjective. ISO 25010 gives us a vocabulary for that conversation. The vision commitments map directly: "zero overhead" ā performance efficiency; "any hardware" ā portability; "zero compromise" ā security + reliability.
What we should do:
- Use the eight quality characteristics as a lens in PR reviews for significant changes
- Include a brief quality impact statement in the PR template for architectural changes (e.g., "This change improves maintainability by reducing coupling between the gateway and channel implementations, at no impact to performance efficiency")
Standards: ISO/IEC 25010:2023
These are already in place and should be maintained:
| Standard | Status | Where |
|---|---|---|
| Semantic Versioning 2.0.0 | ā Adopted |
Cargo.toml, releases |
| Conventional Commits | ā Adopted |
AGENTS.md, commit history |
| RFC 3339 / ISO 8601 timestamps | ā Adopted |
MemoryEntry, all timestamps |
| XDG Base Directory Specification | ā Adopted |
directories crate in use |
| Keep a Changelog | ā Adopted | CHANGELOG.md |
| Rust API Guidelines | ā Partially | Clippy config enforces many |
Each phase follows the Vision ā Architecture ā Design ā Implementation ā Testing ā Documentation ā Release hierarchy. No phase begins implementation until its design is reviewed and agreed upon.
The overall migration strategy is the Strangler Fig Pattern: we grow the new architecture around the edges of the existing code, migrating inward steadily, until the old structure is fully replaced. We never have a "stop the world" rewrite. The application is always shippable.
Theme: Make the architecture visible without changing any behavior. Draw the lines first.
Why this phase: You cannot migrate to a layered architecture until the layers exist as real boundaries. Right now, the traits define logical seams but the compiler does not enforce them ā everything is in one crate, so anything can import anything. This phase makes the seams real.
Vision alignment: None of the vision properties change for users. This is entirely internal. The value is that every future contribution now has a structural home, and new contributors can understand the codebase in parts rather than all at once.
D1: Extract zeroclaw-api crate
Create a new crate crates/zeroclaw-api containing only trait definitions and their supporting types. No implementations. No heavy dependencies. This crate should compile in under two seconds.
Move into this crate:
-
src/providers/traits.rsāProvider,ChatMessage,ChatResponse,ToolCall,StreamChunk,ProviderCapabilities -
src/channels/traits.rsāChannel,ChannelMessage,SendMessage -
src/tools/traits.rsāTool,ToolResult,ToolSpec -
src/memory/traits.rsāMemory,MemoryEntry,MemoryCategory -
src/observability/traits.rsāObserver,ObserverEvent,ObserverMetric -
src/runtime/traits.rsāRuntimeAdapter -
src/peripherals/traits.rsāPeripheral
Every other crate in the workspace that needs these types adds zeroclaw-api as a dependency. The compiler now enforces that no implementation crate can import another implementation crate without going through the API layer.
D2: Extract zeroclaw-tool-call-parser crate
The tool call parsing logic in src/agent/loop_.rs is approximately 1,400 lines of pure text transformation: it takes a string from the LLM and returns a list of structured tool calls. It has no dependency on agent state, memory, providers, or channels. It handles a dozen different LLM output formats (JSON, XML, GLM-style, MiniMax, Perl-style, markdown fences, and more).
This logic is:
- Self-contained ā perfect for its own crate
- The most fuzz-testable code in the project ā property-based tests belong here
- A genuine contribution to the Rust ecosystem ā no other crate does this comprehensively
Create crates/zeroclaw-tool-call-parser with a public API of approximately:
pub fn parse(text: &str, specs: &[ToolSpec]) -> ParseResult
pub struct ParseResult {
pub calls: Vec<ParsedToolCall>,
pub remaining_text: Option<String>,
}
pub struct ParsedToolCall {
pub name: String,
pub arguments: serde_json::Value,
pub tool_call_id: Option<String>,
}The ~300 parsing tests currently in loop_.rs move into this crate. loop_.rs shrinks by approximately 1,400 lines.
D3: Adopt OpenTelemetry as the observability standard
Formalize what is already implemented: document that ObserverEvent and ObserverMetric are the internal event bus, and that OtelObserver is the canonical production backend. Add a JSON structured logging subscriber for ZEROCLAW_LOG_FORMAT=json. Adopt W3C Trace Context for future cross-component tracing.
D4: Write WIT interface files
Before we implement WASM plugin execution, define the contracts. Create a wit/ directory at the workspace root with interface definitions for:
-
zeroclaw:tool/tool.witā the Tool plugin interface -
zeroclaw:channel/channel.witā the Channel plugin interface
These become the official plugin SDK. The implementation in v0.8.0 will be generated from these files.
-
zeroclaw-apicompiles in < 2 seconds with zero implementation dependencies -
zeroclaw-tool-call-parserhas ā„ 95% test coverage (the logic is fully testable in isolation) -
loop_.rsis under 8,000 lines - Zero user-facing behavior changes
- Zero performance regressions (benchmark suite passes)
Theme: Formalize the agent runtime as a clean, independently deployable unit. Everything that is not the runtime becomes a guest.
Why this phase: Once the seams exist (v0.7.0), we can draw the runtime boundary explicitly. This phase extracts zeroclaw-runtime as a standalone crate, completes the WASM plugin execution bridge, and wires the plugin registry client ā the mechanism by which everything outside the runtime connects to it.
Vision alignment: This is where the composition model becomes real for users. A user who wants only a CLI agent downloads one binary, runs zeroclaw onboard, and is done ā no Rust toolchain, no compilation. The zeroclaw onboard wizard gains the ability to download plugin components on demand.
D1: Formalize zeroclaw-runtime crate
Extract the agent orchestration loop, CLI channel, security policy, plugin host, and IPC API into crates/zeroclaw-runtime, gated by the agent-runtime feature. This crate depends on zeroclaw-api and the foundation crates. It has no knowledge of Telegram, Discord, Anthropic, or any specific tool implementation.
The runtime exports a clean public API:
pub struct Runtime { ... }
pub struct Registry {
pub fn register_channel(&mut self, ch: Arc<dyn Channel>);
pub fn register_tool(&mut self, t: Box<dyn Tool>);
pub fn set_provider(&mut self, p: Arc<dyn Provider>);
pub fn set_memory(&mut self, m: Arc<dyn Memory>);
pub fn set_observer(&mut self, o: Arc<dyn Observer>);
}
pub async fn run(runtime: Runtime, registry: Registry) -> anyhow::Result<()>;The binary crate becomes a thin wiring layer that reads config and calls run.
D2: Complete the WASM execution bridge
Wire Extism into WasmTool::execute and WasmChannel. The PluginHost already handles discovery and installation. The execution bridge is the missing piece. With WIT interfaces defined in v0.7.0, use wit-bindgen to generate the host-side bindings.
A complete WasmTool::execute implementation using Extism is approximately 30ā50 lines. The bulk of the work is defining the host functions that WASM plugins can call (HTTP requests, memory access, logging) within the permission model already defined in PluginPermission.
D3: Component registry client
Add a zeroclaw plugin subcommand backed by a simple registry client:
zeroclaw plugin list # list installed plugins
zeroclaw plugin search <query> # search the component registry
zeroclaw plugin install <name> # download, verify, and install a plugin
zeroclaw plugin remove <name> # remove an installed plugin
zeroclaw plugin update # update all installed plugins
The registry is a JSON index file served from a known URL (e.g., https://plugins.zeroclawlabs.ai/index.json). Each entry includes name, version, download URL, SHA-256 checksum, and the publisher's Ed25519 public key. The PluginHost signature verification already handles the security model.
D4: Integrate zeroclaw onboard with the plugin system
The onboarding wizard should ask the user which channels and integrations they want, then call PluginRegistry::install for each. No compilation required. The user downloads a binary, runs zeroclaw onboard, and has a working configured agent in under two minutes.
D5: Reduce all_tools_with_runtime to core tools only
The kernel includes exactly the tools a user needs for a useful agent with no plugins installed: shell, file_read, file_write, file_edit, git_operations, glob_search, content_search, memory_recall, memory_store, memory_forget, and web_fetch. Everything else is registered by installed plugins.
-
zeroclaw-runtimecompiles independently with no channel or tool implementation code -
zeroclaw plugin install channel-discordworks end-to-end -
zeroclaw onboardinstalls plugins without requiring a Rust toolchain - Runtime binary size is tracked and reported in the release notes; the aspiration is downward progress toward the vision target (see §7)
- A WASM tool plugin written in Rust using the WIT interface executes correctly
Theme: Separate the web surface from the agent core.
Why this phase: The gateway is currently the largest structural coupling in the codebase. It embeds a compiled React application, handles channel-specific webhook logic, and is compiled into every binary ā including binaries intended for $10 edge hardware that will never serve a web page.
Vision alignment: This phase delivers the "zero external requirements" promise fully. A user on a Raspberry Pi gets a kernel binary with no web server, no React app, and no HTTP listener. A user who wants the web dashboard installs zeroclaw-gw separately.
D1: Define the kernel IPC API
Before extracting the gateway, define the OpenAPI 3.1 spec for the local API the kernel exposes on a Unix socket or loopback port. This API is what the gateway, the Tauri app, and any future client connects to. It is the stable contract between the kernel and the outside world.
Endpoints include: send a message, receive a streaming response, list active sessions, list installed plugins, get agent status, manage memory, trigger cron jobs. This is a design document first ā the spec should be reviewed and agreed upon before a line of implementation is written.
D2: Implement the kernel IPC server
Add the IPC server to zeroclaw-kernel behind a feature flag (--features ipc). On platforms that support it, the kernel listens on a Unix socket at ~/.zeroclaw/kernel.sock. On Windows, use a named pipe. The zeroclaw gateway command (the current entrypoint for the web server) becomes zeroclaw-gw connecting to this socket.
D3: Extract zeroclaw-gw as a separate binary
Move src/gateway/ to a new crates/zeroclaw-gw/ crate with its own binary. It depends on zeroclaw-api and connects to the kernel via the IPC API. The embedded React application via rust-embed moves entirely into this crate ā the kernel binary no longer contains any web assets.
D4: Migrate channel webhook handlers out of the gateway
The WhatsApp, WATI, Linq, Nextcloud Talk, and Gmail webhook handlers currently in gateway/mod.rs move to their respective channel plugins. The gateway provides a generic webhook registration API: a channel plugin, when loaded, registers its webhook path prefix and its handler function. The gateway routes incoming webhooks to the registered handler. The gateway no longer knows about WhatsApp.
D5: Formalize the Tauri sidecar relationship
Update apps/tauri/ to bundle zeroclaw-gw as a Tauri sidecar binary. The Tauri app becomes the "full experience" distribution: it starts the kernel and gateway automatically and opens the web UI. Users who download the Tauri app get everything working without touching a terminal.
- Kernel binary (release) does not contain any web assets or HTTP server code
-
zeroclaw-gwstarts, connects to the kernel via IPC, and serves the web dashboard - Removing
zeroclaw-gwdoes not break the kernel or any channel plugins - WhatsApp, WATI, Linq, Nextcloud Talk, and Gmail channel code has moved to plugin crates
- Tauri desktop app bundles and starts both binaries correctly
Theme: ZeroClaw becomes a composable platform, not a monolithic application.
Why this phase: With the kernel stable, the gateway separate, and the plugin system working, v1.0.0 is the release where the architecture becomes the product. External developers can write and publish plugins. Users can assemble exactly the ZeroClaw they want. The binary can credibly claim the lean profile the vision promises.
D1: Migrate all remaining channels to plugins
Each of the 27+ channel implementations becomes a standalone WASM plugin crate. They are published to the component registry with signed releases. The kernel binary contains zero channel implementations except the CLI.
D2: Migrate long-tail tools to plugins
Approximately 60 of the 70+ tools move to plugin crates, grouped by domain: zeroclaw-tools-web (browser, search, screenshot, PDF), zeroclaw-tools-integrations (Jira, Notion, Google Workspace, MS365, LinkedIn), zeroclaw-tools-hardware (board info, GPIO), zeroclaw-tools-cloud (cloud ops, security ops). The kernel retains only the 10ā12 core tools identified in v0.8.0.
D3: Plugin SDK and developer documentation
Publish a plugin development guide. A developer should be able to write a new tool plugin in an afternoon:
- Add
zeroclaw-plugin-sdkas a dependency - Implement the WIT-generated trait
cargo build --target wasm32-wasizeroclaw plugin install ./my-plugin/
The SDK handles the host function bindings, the manifest format, and the permissions model.
D4: Stabilize the kernel IPC API at v1.0
The kernel IPC API gets a version prefix (/v1/) and a stability guarantee. Breaking changes in v1.x are not permitted to this API. This is the contract that third-party clients and the gateway depend on.
D5: Extract the versioning policy and stability tier definitions to docs/contributing/stability-tiers.md
The versioning policy and stability tier table defined in §4.4.1 of this RFC become a standing contributor reference document at docs/contributing/stability-tiers.md. This document is the day-to-day reference contributors use when assigning a tier to a new plugin crate, and that maintainers consult when making release decisions. The RFC itself remains the historical record of why these decisions were made; the extracted document is what contributors look up.
- Runtime binary size is tracked against the vision target (see §7); a dedicated optimization pass through each crate is expected as a v1.0.0 workstream
- A third-party developer can publish a working plugin using only public documentation
- All 27+ channel implementations are available as downloadable plugins in the registry
-
zeroclaw onboardcompletes a full setup in under 2 minutes on a Raspberry Pi Zero 2W with no Rust toolchain installed - The full plugin catalog is installable with
zeroclaw plugin install --profile full
These are estimates based on direct code analysis of the current codebase. They are meant to give a sense of scale, not to be exact predictions.
| What moves | Approximate lines | Destination |
|---|---|---|
Tool call parser (from loop_.rs) |
~1,400 |
zeroclaw-tool-call-parser crate |
| 60+ non-core tool implementations | ~30,000 | Plugin crates |
| 24+ non-core channel implementations | ~7,200 | Plugin crates |
| Gateway HTTP server | ~2,260 |
zeroclaw-gw crate |
| Embedded React app (binary weight) | ā |
zeroclaw-gw crate |
| Channel webhook handlers from gateway | ~500 | Channel plugin crates |
| Estimated total removed from runtime | ~41,000 lines | ā |
| File | Current lines | Target after migration | Reduction |
|---|---|---|---|
src/agent/loop_.rs |
~9,500 | ~5,000 | ~47% |
src/gateway/mod.rs |
~2,260 | Moves to zeroclaw-gw
|
100% |
src/tools/mod.rs |
all_tools_with_runtime is ~680 lines |
~80 lines (core tools only) | ~88% |
src/providers/mod.rs |
~3,750 | ~1,200 (providers self-register) | ~68% |
src/channels/mod.rs |
~200 + 44 channel files | CLI channel only | ~90% |
The project's vision is expressed in runtime terms: <5 MB RAM on $10 hardware. Binary size on disk and runtime memory footprint (RSS) are related but not identical ā demand paging means only executed code paths are resident. Both are tracked.
Two-pass model: Architectural decomposition (Phases 1ā3) and binary size optimization are separate workstreams. Decomposition enables optimization by isolating dependencies to their owning crates. Maximizing efficiency crate-by-crate is the expected second pass, not a deliverable of the structural work itself.
| Configuration | Pre-decomposition (v0.6.x) | Phase 1 result (v0.7.0) | Vision target |
|---|---|---|---|
| Full monolith binary | ~8.8 MB | N/A (replaced by plugin model) | N/A |
Foundation only (--no-default-features) |
N/A | 6.6 MB (measured, stripped) | TBD after optimization pass |
Runtime binary (foundation + agent-runtime) |
N/A | tracked ā | aspiration: ⤠5 MB RAM at runtime |
| Runtime + gateway | N/A | tracked ā | ~5ā7 MB on disk |
| Runtime + gateway + top 5 channels | N/A | tracked ā | ~8ā10 MB (plugins are separate files) |
| Tauri desktop app (bundles all) | N/A | tracked ā | ~20ā25 MB installer |
The 6.6 MB Phase 1 foundation build represents real progress from the 8.8 MB monolith and proves the decomposition is working. Reaching the vision target requires a dedicated dependency-audit and optimization pass through each crate after the structural decomposition is complete ā reviewing each crate's Cargo.toml for unnecessary or over-featured dependencies, validating LTO and strip profiles, and auditing which tokio/serde feature flags are actually needed.
The key structural shift: binary size stops being a function of "features compiled in at build time" and becomes a function of "plugins installed at runtime," which the user controls. That shift is the architectural goal of Phases 1ā3. The size numbers are the optimization goal of the pass that follows.
Currently, a full cargo build --release on this codebase compiles every channel, every tool, every provider, and the embedded React app in a single compilation unit. Crate decomposition means:
- The kernel compiles independently and its compiled output is cached
- A change to
channel-discorddoes not recompile the kernel - Contributors working on a plugin only recompile their plugin
- CI can parallelize crate compilation across jobs
Estimated wall-clock time improvement for incremental builds: 60ā75% reduction for changes that do not touch the kernel.
The most common complaint from new contributors to large codebases is: "I don't know where to start." With the current architecture, the answer to "where does a Discord message go?" requires tracing through channels/discord.rs ā channels/mod.rs ā gateway/mod.rs ā agent/loop_.rs ā dozens of other files.
With the microkernel architecture, the answer is: "it goes to the kernel's Channel receiver, via the channel-discord plugin." A new contributor can understand the Discord channel completely by reading one plugin crate. They can understand the full agent loop by reading zeroclaw-kernel without any channel or tool code in scope.
A good rule of thumb for new contributors: if you can describe your change in one sentence without mentioning more than one component, you are working at the right level. "Fix a bug in how the Discord channel handles thread replies" is one component. "Refactor the agent loop and update the Discord channel and also fix the memory backend" is three components ā it should be three PRs.
Every bug report will have a clear home. "The agent is calling tools incorrectly" ā zeroclaw-tool-call-parser or zeroclaw-runtime. "The Discord integration is broken" ā channel-discord plugin. "The web dashboard is not loading" ā zeroclaw-gw. Right now, any of those bugs could be anywhere in 50,000+ lines.
The plugin model means channels and tools can have independent release cycles. A bug fix in the Telegram channel does not require a new kernel release. The kernel's stability becomes the foundation that everything else builds on. Rapid iteration on plugins does not risk kernel stability.
A published WIT interface and plugin SDK means anyone can extend ZeroClaw without forking it. A company that needs a specific integration can write a plugin against the public interface. This is how ecosystems are built.
This RFC is a proposal, not a decision. Before we begin implementation, the team should discuss and reach agreement on the following:
-
Do we agree on the Vision statement in Section 2? If the vision is wrong, everything that follows may be wrong too. This is the most important question.
-
Is the kernel boundary in Section 4.1 correct? Specifically: should any providers be in the kernel, or should they all be plugins? The argument for keeping 2ā3 providers in the kernel is that the agent is useless without one. The argument for making all providers plugins is architectural purity. Which matters more?
-
What is the right WASM runtime for the plugin system? We currently have
extismin the dependency tree (behindplugins-wasm). Alternatives includewasmtimedirectly (more control, more complexity) andwasm3(interpreter-based, runs on microcontrollers, relevant for the $10 hardware goal). This choice has performance and portability implications. -
What is the migration path for existing users? Users who have configured ZeroClaw against the current binary will need to understand what changes when they upgrade to v0.7.0 and beyond. We need a clear upgrade guide and a commitment to not silently breaking existing configurations.
-
Who owns the plugin registry? The registry is a trusted distribution channel. Who controls what gets published? How are plugins reviewed for security? How are compromised plugins revoked? These are governance questions that need answers before the registry goes live.
-
How do we measure progress? The success metrics in Section 6 are a starting point. What metrics matter most to the team?
-
Do we agree with the versioning strategy in Section 4.4.1? Specifically: are the three crate classes that are excluded from workspace inheritance (
zeroclaw-api, hardware library crates, WIT files) the right exceptions? And is the product-level definition of a breaking change (WIT incompatibility, IPC API incompatibility, config migration, CLI removal) scoped correctly, or are there cases it misses? -
What are the observability defaults for the release kernel binary? Section 4.4.2 recommends keeping
observability-prometheusindefaultand leavingobservability-otelas opt-in. Do we agree? The tradeoff is binary size and startup overhead on constrained hardware versus shipping a production runtime that is observable out of the box. If we change the default, what is the opt-out story for edge deployments?
Terms used in this document that may be unfamiliar:
Big Ball of Mud ā An architecture (or lack thereof) in which the codebase has grown organically without structural planning. The name comes from a 1997 paper by Brian Foote and Joseph Yoder. It is the most common architecture in software, not because anyone chooses it, but because it is what you get by default.
Conway's Law ā "Any organization that designs a system will produce a design whose structure is a mirror image of the organization's communication structure." (Mel Conway, 1968) If contributors work in isolated silos without talking to each other, the code will reflect that. If contributors collaborate with clear interfaces between their work, the code will reflect that too.
Dependency Inversion Principle ā High-level modules should not depend on low-level modules. Both should depend on abstractions. This is why zeroclaw-runtime depends on zeroclaw-api (abstractions) and not on channel-discord (a specific implementation).
Microkernel ā An architecture in which the core system contains only the minimum necessary functionality, and all other capabilities are provided by separate components that communicate with the core through well-defined interfaces.
Strangler Fig Pattern ā A migration strategy in which you incrementally replace parts of an existing system by building new components alongside the old ones. Named for the strangler fig plant, which grows around an existing tree until the original tree has been fully replaced. The key property: the system is always running and always deployable during the migration.
Technical Debt ā The accumulated cost of taking shortcuts in software design. Like financial debt, a small amount can be productive (you ship faster now). A large amount becomes crippling (you spend all your time on interest payments ā i.e., bug fixes and workarounds ā instead of new features).
WIT (WebAssembly Interface Types) ā An interface definition language for describing what WASM components export and import. Think of it as a contract: "a Tool plugin must export a function called execute that takes JSON and returns JSON." WIT makes that contract precise and machine-readable.
These are resources the team may find valuable. They are not required reading, but each one has directly influenced this proposal.
-
"A Philosophy of Software Design" ā John Ousterhout. The best short book on managing complexity in software. His concept of "deep modules" (simple interfaces, powerful implementations) is exactly what the microkernel model aims for.
-
"Clean Architecture" ā Robert C. Martin. The Dependency Rule described in Section 4.2 of this document comes from this book.
-
"Release It!" ā Michael Nygard. Practical patterns for building software that stays up in production. The gateway separation and circuit-breaker patterns discussed here are drawn from this book.
-
The Rust API Guidelines ā https://rust-lang.github.io/api-guidelines/ ā The official guide for designing idiomatic Rust libraries. Our trait interfaces should follow these conventions.
-
The WebAssembly Component Model ā https://component-model.bytecodealliance.org/ ā The technical foundation for the plugin system proposed in this RFC.
-
OpenTelemetry specification ā https://opentelemetry.io/docs/specs/ ā The full specification for the observability standard we are adopting.
This proposal was developed from a detailed analysis of the ZeroClaw codebase at v0.6.8. The code metrics cited are based on direct measurement of the source files. The architectural recommendations reflect established patterns in systems software design applied to the specific constraints and goals of the ZeroClaw project.
Feedback, corrections, and counterproposals are welcome. The best architecture is the one the team understands and believes in ā not the one any single person dictated.