03 Core Architecture
This page documents ZeroClaw's high-level system design, showing how the five core components—Agent, Channels, Providers, Tools, and Memory—interact to process user messages and execute tool calls. It covers the trait-driven architecture, the message processing lifecycle, and the configuration system that orchestrates initialization.
For detailed information on specific subsystems:
- Trait implementations and pluggability: See Trait-Driven Design
- Security layers and enforcement: See Security Model
- Complete message processing flow: See Message Processing Flow
- Configuration file format: See Configuration
ZeroClaw is built around five core components, plus supporting security and runtime layers. Apart from the Agent Core, each is reached through a well-defined trait interface:
| Component | Trait | Purpose | Key Implementations |
|---|---|---|---|
| Agent Core | n/a | Orchestrates tool call loops and conversation history | agent::loop_::run_tool_call_loop |
| Channels | Channel | Ingest messages from communication platforms | TelegramChannel, DiscordChannel, CliChannel, EmailChannel |
| Providers | Provider | Abstract LLM API calls with retry/fallback logic | ReliableProvider, OpenAiProvider, AnthropicProvider, OllamaProvider |
| Tools | Tool | Execute actions in response to LLM tool calls | ShellTool, FileReadTool, MemoryStoreTool, GitStatusTool |
| Memory | Memory | Persist and retrieve conversation context | SqliteMemory, PostgresMemory, LucidMemory, MarkdownMemory |
| Security | SecurityPolicy | Enforce autonomy levels and path restrictions | SecurityPolicy::from_config |
| Runtime | RuntimeAdapter | Isolate tool execution | NativeRuntime, DockerRuntime |
Sources: src/channels/mod.rs:30-31, src/agent/loop_.rs:1-17, src/config/schema.rs:48-144, README.md:308-322
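To make the trait-driven design in the table concrete, here is a minimal sketch of how a registry of boxed trait objects allows implementations to be swapped without touching the caller. The `Tool` trait shape, `EchoTool`, `UpperTool`, and `dispatch` are illustrative assumptions, not ZeroClaw's actual definitions.

```rust
/// Hypothetical, simplified stand-in for a `Tool` trait.
pub trait Tool {
    fn name(&self) -> &str;
    fn execute(&self, args: &str) -> Result<String, String>;
}

/// Toy tool that echoes its arguments back.
pub struct EchoTool;

impl Tool for EchoTool {
    fn name(&self) -> &str {
        "echo"
    }
    fn execute(&self, args: &str) -> Result<String, String> {
        Ok(args.to_string())
    }
}

/// Toy tool that upper-cases its arguments.
pub struct UpperTool;

impl Tool for UpperTool {
    fn name(&self) -> &str {
        "upper"
    }
    fn execute(&self, args: &str) -> Result<String, String> {
        Ok(args.to_uppercase())
    }
}

/// Look up a tool by name in the registry and run it.
/// The caller only sees `dyn Tool`, never a concrete type.
pub fn dispatch(tools: &[Box<dyn Tool>], name: &str, args: &str) -> Result<String, String> {
    tools
        .iter()
        .find(|t| t.name() == name)
        .ok_or_else(|| format!("unknown tool: {}", name))?
        .execute(args)
}
```

Adding a new tool in this sketch means implementing the trait and pushing one more box into the registry; `dispatch` stays unchanged.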
```mermaid
graph TB
    subgraph "Entry Points"
        CLI["CLI Commands<br/>(main.rs)"]
        Gateway["HTTP Gateway<br/>(gateway::mod)"]
        Channels["Channels<br/>(channels::mod)"]
    end
    subgraph "Core Orchestration"
        Config["Config<br/>(config::schema::Config)"]
        Agent["Agent Core<br/>(agent::loop_::run_tool_call_loop)"]
        Security["SecurityPolicy<br/>(security::SecurityPolicy)"]
    end
    subgraph "Trait-Based Subsystems"
        Provider["Provider Trait<br/>Arc<dyn Provider>"]
        Memory["Memory Trait<br/>Arc<dyn Memory>"]
        Tool["Tool Trait<br/>Vec<Box<dyn Tool>>"]
        Runtime["RuntimeAdapter Trait<br/>Arc<dyn RuntimeAdapter>"]
    end
    subgraph "External Interfaces"
        LLM["LLM APIs<br/>(OpenRouter, OpenAI, etc.)"]
        Storage["Storage<br/>(SQLite, PostgreSQL)"]
        Messaging["Messaging APIs<br/>(Telegram Bot API, etc.)"]
    end
    CLI --> Agent
    Gateway --> Agent
    Channels --> Agent
    Config -.initializes.-> Agent
    Config -.initializes.-> Channels
    Config -.initializes.-> Provider
    Config -.initializes.-> Memory
    Config -.initializes.-> Security
    Config -.initializes.-> Runtime
    Agent --> Provider
    Agent --> Tool
    Agent --> Memory
    Agent --> Security
    Security -.enforces.-> Agent
    Security -.enforces.-> Tool
    Security -.enforces.-> Gateway
    Provider --> LLM
    Memory --> Storage
    Channels --> Messaging
    Tool --> Runtime
```
Sources: src/channels/mod.rs:33-51, src/agent/loop_.rs:1-16, src/config/schema.rs:48-144, src/main.rs:79-245
The ChannelRuntimeContext is the central orchestration object that holds references to all subsystems needed for message processing:
```mermaid
graph LR
    subgraph "ChannelRuntimeContext<br/>(channels::mod)"
        CTX["ChannelRuntimeContext"]
        CTX --> CHANNELS["channels_by_name<br/>Arc<HashMap<String, Arc<dyn Channel>>>"]
        CTX --> PROVIDER["provider<br/>Arc<dyn Provider>"]
        CTX --> MEMORY["memory<br/>Arc<dyn Memory>"]
        CTX --> TOOLS["tools_registry<br/>Arc<Vec<Box<dyn Tool>>>"]
        CTX --> OBSERVER["observer<br/>Arc<dyn Observer>"]
        CTX --> SYSTEM["system_prompt<br/>Arc<String>"]
        CTX --> HISTORY["conversation_histories<br/>ConversationHistoryMap"]
        CTX --> ROUTES["route_overrides<br/>RouteSelectionMap"]
        CTX --> PCACHE["provider_cache<br/>ProviderCacheMap"]
    end
```
Sources: src/channels/mod.rs:101-123
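The value of holding every subsystem behind an `Arc` is that the context can be cloned cheaply into each worker while all clones share the same state. The sketch below illustrates this with a reduced context; `RuntimeContext`, `CannedProvider`, and the `HistoryMap` layout are assumptions made for the example, and the real `ConversationHistoryMap` may be shaped differently.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the `Provider` trait.
pub trait Provider: Send + Sync {
    fn chat(&self, prompt: &str) -> String;
}

/// Provider double that returns a canned reply.
pub struct CannedProvider;

impl Provider for CannedProvider {
    fn chat(&self, prompt: &str) -> String {
        format!("reply to: {}", prompt)
    }
}

/// Assumed shape: per-sender history behind a shared lock.
pub type HistoryMap = Arc<Mutex<HashMap<String, Vec<String>>>>;

/// Reduced context. Every field is reference-counted, so `clone()`
/// copies pointers rather than subsystems.
#[derive(Clone)]
pub struct RuntimeContext {
    pub provider: Arc<dyn Provider>,
    pub system_prompt: Arc<String>,
    pub histories: HistoryMap,
}

impl RuntimeContext {
    pub fn new(provider: Arc<dyn Provider>, system_prompt: &str) -> Self {
        RuntimeContext {
            provider,
            system_prompt: Arc::new(system_prompt.to_string()),
            histories: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    /// Handle one message: call the provider and record both turns.
    pub fn handle(&self, sender: &str, text: &str) -> String {
        let reply = self.provider.chat(text);
        let mut histories = self.histories.lock().unwrap();
        let turns = histories.entry(sender.to_string()).or_default();
        turns.push(text.to_string());
        turns.push(reply.clone());
        reply
    }
}
```

A history entry written through one clone is visible through every other clone, which is what lets concurrent workers share conversation state.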
Messages from all channels funnel into a unified dispatch loop with concurrency control:
```mermaid
sequenceDiagram
    participant CH as Channel Implementation
    participant TX as Message Channel<br/>(tokio::mpsc)
    participant DL as Dispatch Loop<br/>(run_message_dispatch_loop)
    participant SEM as Semaphore<br/>(tokio::sync::Semaphore)
    participant WK as Worker Task<br/>(process_channel_message)
    CH->>TX: send(ChannelMessage)
    TX->>DL: recv()
    DL->>SEM: acquire_owned()
    SEM-->>DL: permit
    DL->>WK: spawn(process_channel_message)
    Note over WK: Runtime command handling<br/>/models, /model
    WK->>WK: build_memory_context()
    WK->>WK: run_tool_call_loop()
    WK->>WK: save conversation history
    WK->>CH: send response
    WK->>SEM: drop permit
```
Sources: src/channels/mod.rs:816-844, src/channels/mod.rs:556-814, src/channels/mod.rs:146-184
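The dispatch pattern above (receive, acquire a permit, spawn a worker, release on completion) can be sketched without an async runtime. This version deliberately swaps tokio's channel and semaphore for `std::sync::mpsc`, OS threads, and a small hand-rolled counting semaphore, so it shows the concurrency bound rather than ZeroClaw's actual implementation. All names here are hypothetical.

```rust
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore; stands in for tokio::sync::Semaphore.
pub struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    pub fn new(n: usize) -> Self {
        Semaphore { permits: Mutex::new(n), cv: Condvar::new() }
    }
    /// Block until a permit is free, then take it.
    pub fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }
    /// Return a permit and wake one waiter.
    pub fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

/// Drain `rx`, handling each message on its own thread with at most
/// `max_in_flight` workers alive at once. Returns the replies in send
/// order plus the peak number of concurrent workers observed.
pub fn dispatch(rx: mpsc::Receiver<String>, max_in_flight: usize) -> (Vec<String>, usize) {
    let sem = Arc::new(Semaphore::new(max_in_flight));
    let stats = Arc::new(Mutex::new((0usize, 0usize))); // (current, peak)
    let mut workers = Vec::new();
    for msg in rx {
        sem.acquire(); // the loop stalls here while all permits are taken
        let (sem, stats) = (Arc::clone(&sem), Arc::clone(&stats));
        workers.push(thread::spawn(move || {
            {
                let mut s = stats.lock().unwrap();
                s.0 += 1;
                s.1 = s.1.max(s.0);
            }
            let reply = format!("handled: {}", msg);
            stats.lock().unwrap().0 -= 1;
            sem.release();
            reply
        }));
    }
    let replies = workers.into_iter().map(|w| w.join().unwrap()).collect();
    let peak = stats.lock().unwrap().1;
    (replies, peak)
}
```

Because the permit is acquired before the worker is spawned, back-pressure lands on the dispatch loop itself: a burst of messages waits in the channel instead of creating unbounded workers.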
The core agent loop repeats an iterative tool execution cycle until the LLM produces a text-only response or the max_tool_iterations limit is reached:
```mermaid
flowchart TD
    START([User Message]) --> BUILD_CONTEXT["build_context()<br/>Memory recall + hardware RAG"]
    BUILD_CONTEXT --> BUILD_HISTORY["Build conversation history<br/>system prompt + user turns"]
    BUILD_HISTORY --> LOOP_START{{"Tool Call Loop<br/>(max_tool_iterations)"}}
    LOOP_START --> LLM_CALL["provider.chat()<br/>Send messages + tools"]
    LLM_CALL --> PARSE_RESPONSE["Parse response<br/>Extract tool_calls"]
    PARSE_RESPONSE --> HAS_TOOLS{Has tool calls?}
    HAS_TOOLS -->|Yes| EXECUTE_TOOLS["Execute each tool call<br/>- Security check<br/>- Runtime execution<br/>- Credential scrubbing"]
    EXECUTE_TOOLS --> APPEND_RESULTS["Append tool results<br/>to conversation history"]
    APPEND_RESULTS --> CHECK_ITER{Reached<br/>max iterations?}
    CHECK_ITER -->|No| LOOP_START
    CHECK_ITER -->|Yes| FORCE_EXIT["Force text response"]
    HAS_TOOLS -->|No| TEXT_RESPONSE["Text response only"]
    TEXT_RESPONSE --> COMPACT{History too long?}
    FORCE_EXIT --> COMPACT
    COMPACT -->|Yes| AUTO_COMPACT["auto_compact_history()<br/>LLM-based summarization"]
    COMPACT -->|No| SAVE_MEMORY
    AUTO_COMPACT --> SAVE_MEMORY["Save to memory<br/>(if auto_save enabled)"]
    SAVE_MEMORY --> RETURN([Return final response])
```
Sources: src/agent/loop_.rs:688-933, src/agent/loop_.rs:158-205, src/agent/loop_.rs:207-233
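The loop in the flowchart reduces to a bounded iteration with two exits. The sketch below models the provider and the tool runner as closures so it runs on its own; `LlmReply`, `tool_loop_sketch`, and the fallback message are hypothetical, and the real `run_tool_call_loop` additionally handles parsing, security checks, and history compaction.

```rust
/// What the (mock) LLM can return on each turn.
pub enum LlmReply {
    Text(String),
    ToolCall { name: String, args: String },
}

/// Call the model, execute any requested tool, feed the result back,
/// and repeat. Stops on the first text-only reply or after
/// `max_iterations` model calls.
pub fn tool_loop_sketch<P, T>(
    mut provider: P,
    mut run_tool: T,
    history: &mut Vec<String>,
    max_iterations: usize,
) -> String
where
    P: FnMut(&[String]) -> LlmReply,
    T: FnMut(&str, &str) -> String,
{
    for _ in 0..max_iterations {
        match provider(history.as_slice()) {
            LlmReply::Text(text) => {
                // Text-only reply: record it and leave the loop.
                history.push(format!("assistant: {}", text));
                return text;
            }
            LlmReply::ToolCall { name, args } => {
                // Tool call: run it and append the result for the next turn.
                let result = run_tool(&name, &args);
                history.push(format!("tool {}: {}", name, result));
            }
        }
    }
    // Iteration budget exhausted: force a text response.
    "Stopped: reached the tool iteration limit".to_string()
}
```

The iteration cap is what prevents a model that keeps requesting tools from looping indefinitely.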
Each tool call goes through multiple validation and execution stages:
```mermaid
sequenceDiagram
    participant AGENT as Agent Loop
    participant SEC as SecurityPolicy
    participant TOOL as Tool Implementation
    participant RT as RuntimeAdapter
    participant SCRUB as Credential Scrubber
    AGENT->>AGENT: parse_tool_call()
    AGENT->>SEC: can_act()?
    SEC-->>AGENT: Approved/Denied
    alt Denied by SecurityPolicy
        AGENT->>AGENT: Generate denial message
    else Approved
        AGENT->>TOOL: execute(arguments)
        TOOL->>RT: Execute in sandbox
        RT-->>TOOL: Raw output
        TOOL-->>AGENT: Tool result
        AGENT->>SCRUB: scrub_credentials()
        SCRUB-->>AGENT: Sanitized result
        AGENT->>AGENT: Append to history
    end
```
Sources: src/agent/loop_.rs:935-1089, src/agent/loop_.rs:42-77, src/security/mod.rs (referenced)
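The ordering in the diagram matters: the policy check happens before execution, and scrubbing happens before anything reaches the conversation history. A minimal sketch of that gate-execute-scrub sequence follows. The `Autonomy` levels, the `can_act` signature, and the `key=value` scrubbing rule are simplifying assumptions; the real `SecurityPolicy` and scrubber are more thorough.

```rust
/// Hypothetical autonomy levels; the real policy is richer.
#[derive(PartialEq)]
pub enum Autonomy {
    ReadOnly,
    Full,
}

pub struct Policy {
    pub autonomy: Autonomy,
}

impl Policy {
    /// Allow side-effecting tools only at full autonomy (assumed rule).
    pub fn can_act(&self, tool_mutates: bool) -> bool {
        !tool_mutates || self.autonomy == Autonomy::Full
    }
}

fn is_secret_key(key: &str) -> bool {
    let k = key.to_ascii_lowercase();
    k.contains("token") || k.contains("secret") || k.contains("api_key") || k.contains("password")
}

/// Mask the value of any space-separated `key=value` token whose key
/// looks like a credential.
pub fn scrub_credentials(output: &str) -> String {
    output
        .split(' ')
        .map(|tok| match tok.split_once('=') {
            Some((k, _)) if is_secret_key(k) => format!("{}=[REDACTED]", k),
            _ => tok.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Gate on the policy, run the tool, then scrub its output.
pub fn execute_tool_call<F>(policy: &Policy, mutates: bool, tool: F) -> String
where
    F: FnOnce() -> String,
{
    if !policy.can_act(mutates) {
        // The tool closure is never invoked on denial.
        return "denied by security policy".to_string();
    }
    scrub_credentials(&tool())
}
```

Returning a denial string instead of an error lets the model see the refusal as a tool result and adjust its next step.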
ZeroClaw wraps all LLM providers in a resilience layer that handles retries, fallbacks, and API key rotation:
```mermaid
graph TB
    subgraph "Factory Layer"
        FACTORY["providers::create_resilient_provider_with_options()"]
    end
    subgraph "Resilience Wrapper"
        RELIABLE["ReliableProvider<br/>(retry + fallback + key rotation)"]
    end
    subgraph "Provider Implementations"
        OPENAI["OpenAiProvider"]
        ANTHROPIC["AnthropicProvider"]
        OPENROUTER["OpenRouterProvider"]
        GEMINI["GeminiProvider"]
        OLLAMA["OllamaProvider"]
        COMPATIBLE["OpenAiCompatibleProvider<br/>(Venice, Groq, Mistral, etc.)"]
    end
    FACTORY --> RELIABLE
    RELIABLE --> OPENAI
    RELIABLE --> ANTHROPIC
    RELIABLE --> OPENROUTER
    RELIABLE --> GEMINI
    RELIABLE --> OLLAMA
    RELIABLE --> COMPATIBLE
```
Sources: src/providers/mod.rs (referenced), src/channels/mod.rs:290-308, src/gateway/mod.rs:300-310
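The retry-then-fallback behaviour of a resilience wrapper can be sketched as two nested loops: the outer one walks the provider chain, the inner one retries the current provider. The `Provider` trait shape, `Reliable`, and the `Flaky` test double below are assumptions for illustration. The sketch omits the backoff delays and API key rotation that the real `ReliableProvider` is described as handling.

```rust
/// Hypothetical stand-in for the `Provider` trait.
pub trait Provider {
    fn name(&self) -> &str;
    fn chat(&mut self, prompt: &str) -> Result<String, String>;
}

/// Wrapper that retries each provider, then falls back to the next.
pub struct Reliable {
    pub providers: Vec<Box<dyn Provider>>,
    pub max_retries: usize,
}

impl Reliable {
    pub fn chat(&mut self, prompt: &str) -> Result<String, String> {
        let mut errors = Vec::new();
        for p in self.providers.iter_mut() {
            // One initial attempt plus `max_retries` retries per provider.
            for attempt in 0..=self.max_retries {
                match p.chat(prompt) {
                    Ok(reply) => return Ok(reply),
                    Err(e) => errors.push(format!("{} attempt {}: {}", p.name(), attempt + 1, e)),
                }
            }
        }
        // Every provider exhausted: report the full failure trail.
        Err(errors.join("; "))
    }
}

/// Test double that fails a fixed number of times before succeeding.
pub struct Flaky {
    pub label: &'static str,
    pub failures_left: usize,
}

impl Provider for Flaky {
    fn name(&self) -> &str {
        self.label
    }
    fn chat(&mut self, prompt: &str) -> Result<String, String> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            Err("timeout".to_string())
        } else {
            Ok(format!("{} says: {}", self.label, prompt))
        }
    }
}
```

Because the wrapper exposes the same `chat` call as the providers it wraps, callers do not need to know whether resilience is enabled.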
Channels support per-sender provider/model switching via runtime commands:
| Command | Function | State Storage |
|---|---|---|
| `/models` | List available providers | n/a |
| `/models <provider>` | Switch provider | RouteSelectionMap |
| `/model` | Show current model | n/a |
| `/model <model-id>` | Switch model | RouteSelectionMap + clear history |
Sources: src/channels/mod.rs:142-144, src/channels/mod.rs:146-184, src/channels/mod.rs:365-441
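The four commands in the table map naturally onto an enum, with the presence or absence of an argument selecting between "show" and "switch". The parser below is a sketch; `RuntimeCommand` and `parse_runtime_command` are hypothetical names and the actual parsing rules may differ.

```rust
/// The four runtime commands from the table above.
#[derive(Debug, PartialEq)]
pub enum RuntimeCommand {
    ListProviders,
    SwitchProvider(String),
    ShowModel,
    SwitchModel(String),
}

/// Parse an incoming message. `None` means it is an ordinary chat
/// message that should go on to the agent loop.
pub fn parse_runtime_command(text: &str) -> Option<RuntimeCommand> {
    let mut parts = text.split_whitespace();
    let command = parts.next()?;
    let arg = parts.next().map(str::to_string);
    match (command, arg) {
        ("/models", None) => Some(RuntimeCommand::ListProviders),
        ("/models", Some(p)) => Some(RuntimeCommand::SwitchProvider(p)),
        ("/model", None) => Some(RuntimeCommand::ShowModel),
        ("/model", Some(m)) => Some(RuntimeCommand::SwitchModel(m)),
        _ => None,
    }
}
```

Handling these commands before the agent loop means switching provider or model never costs an LLM call.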
```mermaid
flowchart TD
    START([Program Start]) --> PARSE_ARGS["Parse CLI Arguments<br/>(clap)"]
    PARSE_ARGS --> CMD_CHECK{Command?}
    CMD_CHECK -->|onboard| ONBOARD["Onboarding Wizard<br/>(onboard::run_wizard)"]
    CMD_CHECK -->|Other| LOAD_CONFIG["Config::load_or_init()"]
    ONBOARD --> SAVE["Config::save()"]
    LOAD_CONFIG --> CHECK_FILE{config.toml<br/>exists?}
    CHECK_FILE -->|No| INIT_DEFAULT["Create with defaults"]
    CHECK_FILE -->|Yes| PARSE_TOML["Parse TOML"]
    PARSE_TOML --> DECRYPT{secrets.encrypt?}
    DECRYPT -->|Yes| DECRYPT_SECRETS["SecretStore::decrypt<br/>(ChaCha20Poly1305)"]
    DECRYPT -->|No| APPLY_ENV
    DECRYPT_SECRETS --> APPLY_ENV["apply_env_overrides()<br/>(OPENROUTER_API_KEY, etc.)"]
    INIT_DEFAULT --> APPLY_ENV
    APPLY_ENV --> INIT_SUBSYSTEMS["Initialize Subsystems"]
    INIT_SUBSYSTEMS --> INIT_PROVIDER["create_resilient_provider"]
    INIT_SUBSYSTEMS --> INIT_MEMORY["create_memory_with_storage"]
    INIT_SUBSYSTEMS --> INIT_TOOLS["all_tools_with_runtime"]
    INIT_SUBSYSTEMS --> INIT_SECURITY["SecurityPolicy::from_config"]
    INIT_SUBSYSTEMS --> INIT_RUNTIME["create_runtime"]
    CMD_CHECK -->|agent| AGENT_MODE["agent::run()"]
    CMD_CHECK -->|gateway| GATEWAY_MODE["gateway::run_gateway()"]
    CMD_CHECK -->|daemon| DAEMON_MODE["daemon::run()"]
    SAVE --> AGENT_MODE
    INIT_PROVIDER --> AGENT_MODE
    INIT_PROVIDER --> GATEWAY_MODE
    INIT_PROVIDER --> DAEMON_MODE
```
Sources: src/main.rs:476-609, src/config/mod.rs (referenced), src/config/schema.rs:48-144
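The environment override step in the flow above layers environment values over whatever was read from the file. The sketch below shows that precedence with a two-field config. `MiniConfig` and the injected `lookup` closure are assumptions made for testability, and `EXAMPLE_DEFAULT_PROVIDER` is a made-up variable name; only `OPENROUTER_API_KEY` appears in the source.

```rust
/// Reduced, hypothetical slice of the configuration.
#[derive(Debug, Default, PartialEq)]
pub struct MiniConfig {
    pub api_key: Option<String>,
    pub default_provider: Option<String>,
}

/// Overlay environment values on top of file values. `lookup`
/// abstracts the environment so the function can be exercised
/// without mutating process state.
pub fn apply_env_overrides<F>(mut config: MiniConfig, lookup: F) -> MiniConfig
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(key) = lookup("OPENROUTER_API_KEY") {
        config.api_key = Some(key);
    }
    // Hypothetical variable name, used only for this example.
    if let Some(provider) = lookup("EXAMPLE_DEFAULT_PROVIDER") {
        config.default_provider = Some(provider);
    }
    config
}
```

In a real binary the lookup would typically be `|k| std::env::var(k).ok()`. Values absent from the environment leave the file-provided settings untouched.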
```rust
pub struct Config {
    // Core configuration
    pub workspace_dir: PathBuf,
    pub config_path: PathBuf,
    pub api_key: Option<String>,
    pub api_url: Option<String>,
    pub default_provider: Option<String>,
    pub default_model: Option<String>,
    pub default_temperature: f64,
    // Subsystem configuration
    pub autonomy: AutonomyConfig,
    pub runtime: RuntimeConfig,
    pub reliability: ReliabilityConfig,
    pub agent: AgentConfig,
    pub channels_config: ChannelsConfig,
    pub memory: MemoryConfig,
    pub gateway: GatewayConfig,
    pub composio: ComposioConfig,
    pub secrets: SecretsConfig,
    pub browser: BrowserConfig,
    pub tunnel: TunnelConfig,
    // ... and more
}
```

Sources: src/config/schema.rs:48-144
The HTTP gateway provides webhook and WebSocket interfaces with security-first defaults:
```mermaid
graph TB
    subgraph "AppState<br/>(gateway::mod::AppState)"
        STATE["AppState"]
        STATE --> CONFIG["config<br/>Arc<Mutex<Config>>"]
        STATE --> PROVIDER["provider<br/>Arc<dyn Provider>"]
        STATE --> MEMORY["mem<br/>Arc<dyn Memory>"]
        STATE --> PAIRING["pairing<br/>Arc<PairingGuard>"]
        STATE --> RATE["rate_limiter<br/>Arc<GatewayRateLimiter>"]
        STATE --> IDEMPOTENCY["idempotency_store<br/>Arc<IdempotencyStore>"]
        STATE --> WHATSAPP["whatsapp<br/>Option<Arc<WhatsAppChannel>>"]
        STATE --> OBSERVER["observer<br/>Arc<dyn Observer>"]
    end
    subgraph "Routes"
        HEALTH["/health<br/>(handle_health)"]
        METRICS["/metrics<br/>(handle_metrics)"]
        PAIR["/pair<br/>(handle_pair)"]
        WEBHOOK["/webhook<br/>(handle_webhook)"]
        WHATSAPP_VERIFY["/whatsapp GET<br/>(handle_whatsapp_verify)"]
        WHATSAPP_MSG["/whatsapp POST<br/>(handle_whatsapp_message)"]
    end
    STATE --> HEALTH
    STATE --> METRICS
    STATE --> PAIR
    STATE --> WEBHOOK
    STATE --> WHATSAPP_VERIFY
    STATE --> WHATSAPP_MSG
```
Sources: src/gateway/mod.rs:260-279, src/gateway/mod.rs:482-496, src/gateway/mod.rs:512-605
| Layer | Implementation | Configuration |
|---|---|---|
| Network Binding | 127.0.0.1 default, refuses 0.0.0.0 without tunnel | gateway.host, gateway.allow_public_bind |
| Pairing | 6-digit one-time code → bearer token | gateway.require_pairing, PairingGuard |
| Rate Limiting | Sliding window per client key | gateway.pair_rate_limit_per_minute, gateway.webhook_rate_limit_per_minute |
| Idempotency | TTL-based duplicate detection | gateway.idempotency_ttl_secs |
| Signature Verification | HMAC-SHA256 for WhatsApp webhooks | channels_config.whatsapp.app_secret |
| Body Size Limit | 64KB max request body | MAX_BODY_SIZE (tower middleware) |
| Request Timeout | 30s timeout on all requests | REQUEST_TIMEOUT_SECS (tower middleware) |
Sources: src/gateway/mod.rs:37-45, src/gateway/mod.rs:66-158, src/gateway/mod.rs:283-293, src/security/pairing.rs (referenced)
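Of the layers in the table, the sliding-window rate limit is the one whose mechanics are least obvious from its name. A sketch follows: each client key keeps a queue of recent request times, entries older than the window are evicted, and a request is allowed only if fewer than `limit` remain. `SlidingWindowLimiter` is a hypothetical type; the real `GatewayRateLimiter` may store its state differently. The caller passes `now` explicitly so the behaviour is deterministic.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Sliding-window limiter keyed by client identifier.
pub struct SlidingWindowLimiter {
    limit: usize,
    window: Duration,
    hits: HashMap<String, VecDeque<Instant>>,
}

impl SlidingWindowLimiter {
    pub fn new(limit: usize, window: Duration) -> Self {
        SlidingWindowLimiter { limit, window, hits: HashMap::new() }
    }

    /// Record a request from `client` at time `now`.
    /// Returns true if the request is within the limit.
    pub fn allow(&mut self, client: &str, now: Instant) -> bool {
        let hits = self.hits.entry(client.to_string()).or_default();
        // Evict timestamps that have slid out of the window.
        while let Some(&oldest) = hits.front() {
            if now.duration_since(oldest) >= self.window {
                hits.pop_front();
            } else {
                break;
            }
        }
        if hits.len() < self.limit {
            hits.push_back(now);
            true
        } else {
            // Rejected requests are not recorded, so they cannot
            // extend the client's lockout.
            false
        }
    }
}
```

Unlike a fixed per-minute counter, this approach has no boundary at which a client can send two full bursts back to back.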
```mermaid
graph LR
    subgraph "Factory"
        FACTORY["memory::create_memory_with_storage()"]
    end
    subgraph "Memory Implementations"
        SQLITE["SqliteMemory<br/>(FTS5 + vector)"]
        POSTGRES["PostgresMemory"]
        LUCID["LucidMemory<br/>(external binary)"]
        MARKDOWN["MarkdownMemory<br/>(file-based)"]
        NONE["NoopMemory"]
    end
    FACTORY -->|backend = sqlite| SQLITE
    FACTORY -->|backend = postgres| POSTGRES
    FACTORY -->|backend = lucid| LUCID
    FACTORY -->|backend = markdown| MARKDOWN
    FACTORY -->|backend = none| NONE
```
Sources: src/memory/mod.rs (referenced), src/channels/mod.rs:443-469, src/agent/loop_.rs:207-233
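A factory like the one in the diagram is essentially a match on the configured backend name that returns a boxed trait object. The sketch below keeps the backend names from the diagram but substitutes a single in-memory `MapMemory` for all persistent backends, since the real ones need external storage. The `Memory` trait shape and `create_memory` are illustrative assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the `Memory` trait.
pub trait Memory {
    fn store(&mut self, key: &str, value: &str);
    fn recall(&self, key: &str) -> Option<String>;
}

/// In-memory stand-in used here for every persistent backend.
#[derive(Default)]
pub struct MapMemory {
    items: HashMap<String, String>,
}

impl Memory for MapMemory {
    fn store(&mut self, key: &str, value: &str) {
        self.items.insert(key.to_string(), value.to_string());
    }
    fn recall(&self, key: &str) -> Option<String> {
        self.items.get(key).cloned()
    }
}

/// Backend that accepts writes and remembers nothing.
pub struct NoopMemory;

impl Memory for NoopMemory {
    fn store(&mut self, _key: &str, _value: &str) {}
    fn recall(&self, _key: &str) -> Option<String> {
        None
    }
}

/// Select an implementation from the configured backend name.
pub fn create_memory(backend: &str) -> Result<Box<dyn Memory>, String> {
    match backend {
        // A real factory would construct a distinct type per arm.
        "sqlite" | "postgres" | "lucid" | "markdown" => Ok(Box::new(MapMemory::default())),
        "none" => Ok(Box::new(NoopMemory)),
        other => Err(format!("unknown memory backend: {}", other)),
    }
}
```

A no-op backend lets the rest of the system call `store` and `recall` unconditionally instead of checking whether memory is enabled.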
The agent loop interacts with memory at three key points:
- Context Recall: build_context() retrieves relevant memories before the LLM call
- Auto-Save: Stores user messages when memory.auto_save = true
- History Compaction: Triggers summarization when history exceeds agent.max_history_messages
Sources: src/agent/loop_.rs:207-233, src/channels/mod.rs:588-608, src/agent/loop_.rs:158-205
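History compaction, the third interaction point above, can be sketched as a threshold check followed by replacing the oldest messages with one summary. In this sketch the summarizer is a closure standing in for the LLM call; `compact_history` and its `keep_recent` parameter are hypothetical and not taken from `auto_compact_history()`.

```rust
/// Compact `history` in place once it exceeds `max_messages`.
/// The oldest entries are replaced by a single summary and the most
/// recent `keep_recent` entries are kept verbatim.
/// Returns true if compaction happened.
pub fn compact_history<F>(
    history: &mut Vec<String>,
    max_messages: usize,
    keep_recent: usize,
    summarize: F,
) -> bool
where
    F: FnOnce(&[String]) -> String,
{
    if history.len() <= max_messages || keep_recent >= history.len() {
        return false;
    }
    let split = history.len() - keep_recent;
    // Remove the older messages, summarize them, and put the summary
    // at the front of what remains.
    let old: Vec<String> = history.drain(..split).collect();
    let summary = summarize(&old);
    history.insert(0, summary);
    true
}
```

Keeping the latest turns verbatim preserves the immediate conversational context while the summary bounds the total prompt size.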
Channels use a supervised restart pattern with exponential backoff:
```mermaid
stateDiagram-v2
    [*] --> Running: spawn_supervised_listener()
    Running --> CheckHealth: listen() exits
    CheckHealth --> Success: Ok(())
    CheckHealth --> Error: Err(e)
    Success --> Sleep: Reset backoff
    Error --> Sleep: Double backoff
    Sleep --> Running: Restart
    note right of Running
        mark_component_ok()
    end note
    note right of Error
        mark_component_error()
        bump_component_restart()
    end note
```
Sources: src/channels/mod.rs:471-509, src/health/mod.rs (referenced)
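The backoff bookkeeping in the state diagram comes down to two rules: reset on a clean exit, double on an error. The sketch below adds an upper cap, which is an assumption (the source does not state one), and returns the delay to sleep before the next restart. `Backoff` and `next_delay` are hypothetical names.

```rust
use std::time::Duration;

/// Restart delay tracker for a supervised listener.
pub struct Backoff {
    initial: Duration,
    max: Duration,
    current: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, max: Duration) -> Self {
        Backoff { initial, max, current: initial }
    }

    /// Given how the listener exited, return how long to sleep
    /// before restarting it.
    pub fn next_delay(&mut self, result: &Result<(), String>) -> Duration {
        match result {
            Ok(()) => {
                // Clean exit: reset to the initial delay.
                self.current = self.initial;
                self.current
            }
            Err(_) => {
                // Failure: sleep for the current delay, then double it
                // (up to the cap) for the next failure.
                let delay = self.current;
                self.current = (self.current * 2).min(self.max);
                delay
            }
        }
    }
}
```

A supervisor loop would call `next_delay` each time `listen()` returns, sleep for the result, and then restart the listener.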
| Component | Mechanism | Configuration |
|---|---|---|
| Channel Dispatch | Semaphore-based parallelism | CHANNEL_PARALLELISM_PER_CHANNEL = 4 |
| Max In-Flight Messages | Dynamic scaling per channel count | 8 to 64 messages |
| Typing Indicator | Token-based task cancellation | CHANNEL_TYPING_REFRESH_INTERVAL_SECS = 4 |
| Message Timeout | Per-message timeout for LLM + tools | CHANNEL_MESSAGE_TIMEOUT_SECS = 300 |
Sources: src/channels/mod.rs:61-69, src/channels/mod.rs:511-518, src/channels/mod.rs:526-554, src/channels/mod.rs:698-716
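One plausible reading of the first two rows is that the in-flight limit is the channel count multiplied by the per-channel parallelism, clamped to the 8 to 64 range. The sketch below encodes that reading; the formula and the names `MIN_IN_FLIGHT` and `MAX_IN_FLIGHT` are assumptions, and only `CHANNEL_PARALLELISM_PER_CHANNEL = 4` and the two bounds come from the table.

```rust
pub const CHANNEL_PARALLELISM_PER_CHANNEL: usize = 4;
/// Assumed names for the bounds listed in the table.
pub const MIN_IN_FLIGHT: usize = 8;
pub const MAX_IN_FLIGHT: usize = 64;

/// Scale the in-flight budget with the number of active channels,
/// keeping it within the documented bounds (assumed formula).
pub fn max_in_flight(channel_count: usize) -> usize {
    channel_count
        .saturating_mul(CHANNEL_PARALLELISM_PER_CHANNEL)
        .clamp(MIN_IN_FLIGHT, MAX_IN_FLIGHT)
}
```

Under this reading, a single channel still gets 8 permits, and deployments with 16 or more channels share a fixed budget of 64.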
ZeroClaw's architecture is defined by:
- Trait-driven modularity: Every major subsystem (Provider, Channel, Tool, Memory, Runtime) uses traits for pluggability
- Config-driven initialization: The Config struct orchestrates all subsystem creation with secure defaults
- Unified message processing: All channels funnel into process_channel_message() → run_tool_call_loop()
- Multi-layer security: Authentication, authorization, and isolation enforced at every boundary
- Resilient operation: Supervised restart, retry logic, and error handling at the provider and channel layers
The result is a system where the trait-backed subsystems (providers, channels, tools, memory backends, and runtimes) can be swapped via configuration changes without code modifications, while maintaining defense-in-depth security and operational resilience.
Sources: src/channels/mod.rs, src/agent/loop_.rs, src/config/schema.rs, src/gateway/mod.rs, src/main.rs, README.md