07.1 Agent Turn Cycle

Nikolay Vyahhi edited this page Feb 19, 2026 · 3 revisions

Purpose and Scope

This document describes the agent turn cycle — the core reasoning loop that processes a single user message through LLM inference, tool execution, and response generation. A "turn" represents one complete request-response interaction, potentially involving multiple LLM calls and tool invocations before producing a final answer.

This page focuses on the execution mechanics of a single turn. For information about how turns are triggered from different entry points (CLI, channels, gateway), see Message Processing Flow. For details on tool execution security, see Tool Execution. For conversation history persistence, see History Management.

Sources: src/agent/loop_.rs:1-1107


High-Level Turn Flow

A turn begins when a user message enters the system and ends when the agent produces a text-only response. The cycle may execute multiple iterations if the LLM requests tool usage:

User Message → Context Enrichment → LLM Call → Tool Calls? → Execute Tools → Loop
                                                    ↓ No
                                              Final Response

The turn continues iterating until either:

  1. The LLM produces a text-only response (no tool calls)
  2. Maximum iteration limit is reached (default: 10)
  3. An unrecoverable error occurs
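
The decision logic behind these exit conditions can be sketched in isolation. This is a minimal sketch under stated assumptions: `LlmResponse` and the `call_llm` closure are hypothetical simplified stand-ins, and the real `run_tool_call_loop()` also threads history, streaming, and security state through each iteration.

```rust
// Hypothetical simplified response type; the real code uses ChatResponse.
struct LlmResponse {
    text: String,
    tool_calls: Vec<String>, // parsed tool-call names, simplified
}

// Sketch of the turn loop's termination logic: exit on a text-only
// response, or error out once the iteration budget is exhausted.
fn run_turn(
    mut call_llm: impl FnMut(usize) -> LlmResponse,
    max_iterations: usize,
) -> Result<String, String> {
    for _iteration in 0..max_iterations {
        let response = call_llm(_iteration);
        if response.tool_calls.is_empty() {
            // Text-only response (possibly empty): the turn is done.
            return Ok(response.text);
        }
        // Tool calls present: execute them (omitted here) and loop again.
    }
    Err(format!("max tool iterations ({max_iterations}) reached"))
}
```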

Sources: src/agent/loop_.rs:816-1107


Turn Cycle Architecture

The following diagram shows the code entities involved in a complete agent turn:

graph TB
    subgraph "Entry Points"
        CLI["CLI Agent Mode<br/>agent::run()"]
        Gateway["Gateway /webhook<br/>handle_webhook()"]
        Channel["Channel Listener<br/>process_channel_message()"]
        Scheduler["Cron Scheduler<br/>run_agent_job()"]
    end
    
    subgraph "Turn Orchestration"
        AgentTurn["agent_turn()<br/>src/agent/loop_.rs:820"]
        RunLoop["run_tool_call_loop()<br/>src/agent/loop_.rs:851"]
    end
    
    subgraph "Context Building"
        BuildContext["build_context()<br/>loop_.rs:210"]
        BuildHWContext["build_hardware_context()<br/>loop_.rs:237"]
        MemoryRecall["Memory::recall()"]
        HardwareRag["HardwareRag::retrieve()"]
    end
    
    subgraph "LLM Interaction"
        ProviderChat["Provider::chat()<br/>loop_.rs:893"]
        ParseToolCalls["parse_tool_calls()<br/>loop_.rs:595"]
        ParseStructured["parse_structured_tool_calls()<br/>loop_.rs:750"]
    end
    
    subgraph "Tool Execution"
        FindTool["find_tool()<br/>loop_.rs:276"]
        ToolExecute["Tool::execute()"]
        ScrubCreds["scrub_credentials()<br/>loop_.rs:45"]
        BuildResults["Build tool_results string<br/>loop_.rs:993"]
    end
    
    subgraph "History Management"
        HistoryPush["history.push(ChatMessage)"]
        TrimHistory["trim_history()<br/>loop_.rs:116"]
        AutoCompact["auto_compact_history()<br/>loop_.rs:158"]
    end
    
    CLI --> AgentTurn
    Gateway --> AgentTurn
    Channel --> AgentTurn
    Scheduler --> AgentTurn
    
    AgentTurn --> RunLoop
    
    RunLoop --> BuildContext
    RunLoop --> BuildHWContext
    BuildContext --> MemoryRecall
    BuildHWContext --> HardwareRag
    
    RunLoop --> ProviderChat
    ProviderChat --> ParseToolCalls
    ProviderChat --> ParseStructured
    
    RunLoop --> FindTool
    FindTool --> ToolExecute
    ToolExecute --> ScrubCreds
    ScrubCreds --> BuildResults
    
    BuildResults --> HistoryPush
    HistoryPush --> TrimHistory
    TrimHistory --> AutoCompact
    AutoCompact --> RunLoop

Sources: src/agent/loop_.rs:820-846, src/agent/loop_.rs:851-1107, src/gateway/mod.rs:720-763, src/cron/scheduler.rs:119-149, src/main.rs:546-554


Turn Cycle Phases

Phase 1: Context Enrichment

Before the first LLM call, the system enriches the user message with relevant context:

Memory Context (src/agent/loop_.rs:210-233):

  • Queries memory backend using hybrid search (vector + keyword)
  • Retrieves top 5 relevant entries with score ≥ min_relevance_score
  • Formats as [Memory context]\n- key: content\n

Hardware Context (src/agent/loop_.rs:237-273):

  • Retrieves datasheet chunks from HardwareRag when peripherals are enabled
  • Includes pin alias lookup (e.g., "red_led" → GPIO 13)
  • Formats as [Hardware documentation]\n--- source (board) ---\ncontent\n

Context is prepended to the user message before sending to the LLM.
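
As a rough sketch of this enrichment step (the block labels follow the formats documented above; the assembly function itself is an assumption, not the exact code in `build_context()`):

```rust
// Sketch of context enrichment: memory and hardware context blocks are
// prepended to the raw user message before the first LLM call.
fn enrich_message(
    user_message: &str,
    memory_entries: &[(String, String)], // (key, content) pairs
    hardware_docs: &[String],
) -> String {
    let mut out = String::new();
    if !memory_entries.is_empty() {
        out.push_str("[Memory context]\n");
        for (key, content) in memory_entries {
            out.push_str(&format!("- {key}: {content}\n"));
        }
        out.push('\n');
    }
    if !hardware_docs.is_empty() {
        out.push_str("[Hardware documentation]\n");
        for doc in hardware_docs {
            out.push_str(doc);
            out.push('\n');
        }
        out.push('\n');
    }
    out.push_str(user_message);
    out
}
```

With no matching memory or hardware entries, the message passes through unchanged.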

Sources: src/agent/loop_.rs:210-273


Phase 2: System Prompt Construction

The system prompt is built from multiple sources (handled by calling code, not shown in loop_.rs):

| Component | Source | Purpose |
|---|---|---|
| Identity | ~/.zeroclaw/identity.md | Agent personality and role |
| Bootstrap | ~/.zeroclaw/bootstrap_*.md | Environment-specific instructions |
| Tool Descriptions | Tool::description() + parameters_schema() | Available capabilities |
| Security Rules | SecurityPolicy | Autonomy constraints |
| Channel Instructions | Channel-specific formatting rules | Platform conventions |

For details on system prompt assembly, see System Prompt Construction.

Sources: Referenced in src/agent/loop_.rs:875-890


Phase 3: LLM Call and Response Parsing

The system constructs a ChatRequest with the conversation history and optional tool specifications:

sequenceDiagram
    participant Loop as run_tool_call_loop()
    participant Provider as Provider::chat()
    participant Parser as parse_tool_calls()
    
    Loop->>Provider: ChatRequest { messages, tools? }
    Provider->>Provider: Call LLM API
    Provider-->>Loop: ChatResponse { text, tool_calls }
    
    alt Native tool calls present
        Loop->>Loop: parse_structured_tool_calls()
    else Text contains tool tags
        Loop->>Parser: parse_tool_calls(text)
        Parser-->>Loop: Vec<ParsedToolCall>
    end

Tool Call Format Support:

  1. Native OpenAI Format (src/agent/loop_.rs:750-759):

    • Structured ToolCall objects with id, name, arguments
    • Used by providers with native function calling support
  2. XML Tag Format (src/agent/loop_.rs:582-671):

    • <tool_call>{"name": "...", "arguments": {...}}</tool_call>
    • Variants: <toolcall>, <tool-call>, <invoke>
    • Used by models trained on prompt-guided tool use
  3. Markdown Code Block Format (src/agent/loop_.rs:676-709):

    • ```tool_call\n{...}\n```
    • Fallback for OpenRouter models that mix formats
  4. GLM-Style Format (src/agent/loop_.rs:486-580):

    • browser_open/url>https://example.com
    • shell/command>ls -la
    • Proprietary format for GLM models

Security Note: Tool call extraction only processes content within explicit markers (native format, XML tags, markdown blocks, GLM patterns). Raw JSON in message body is never parsed as tool calls to prevent prompt injection attacks where malicious content (emails, files, web pages) could contain JSON mimicking tool calls.

Sources: src/agent/loop_.rs:595-748, src/agent/loop_.rs:750-759, src/agent/loop_.rs:486-580, src/agent/loop_.rs:893-952


Phase 4: Tool Execution

When the LLM response contains tool calls, each is executed in sequence with security checks:

graph LR
    ToolCalls["Parsed Tool Calls"] --> Loop["For Each Call"]
    
    Loop --> Approval["Approval Hook<br/>supervised mode"]
    Approval --> FindTool["find_tool(name)"]
    
    FindTool -->|Found| Execute["Tool::execute(args)"]
    FindTool -->|Not Found| UnknownResult["Unknown tool:<br/>name not in registry"]
    
    Execute -->|Success| Scrub["scrub_credentials(output)"]
    Execute -->|Failure| ErrorResult["Tool error:<br/>execution failed"]
    
    Scrub --> FormatResult["Format as<br/>tool_result XML"]
    UnknownResult --> FormatResult
    ErrorResult --> FormatResult
    
    FormatResult --> Accumulate["Append to<br/>tool_results string"]
    Accumulate --> Loop

Execution Steps (src/agent/loop_.rs:990-1062):

  1. Approval Check (if ApprovalManager present):

    • Interactive CLI: Prompt user for approval
    • Other channels: Auto-approve
  2. Tool Lookup (src/agent/loop_.rs:276-278):

    • Search tool registry by name
    • Return "Unknown tool" message if not found
  3. Execute (src/agent/loop_.rs:1030-1056):

    • Call Tool::execute(args) with parsed JSON arguments
    • Catch and format errors as tool results
  4. Credential Scrubbing (src/agent/loop_.rs:45-77):

    • Scan output for sensitive patterns: token, api_key, password, secret, bearer, credential
    • Replace matches with <prefix>*[REDACTED]
    • Prevents accidental credential exfiltration to LLM context
  5. Result Formatting:

    • Wrap each output, error, or denial message as <tool_result name="...">...</tool_result>
    • Append to the accumulated tool_results string returned to the LLM

Sources: src/agent/loop_.rs:990-1073, src/agent/loop_.rs:45-77, src/agent/loop_.rs:276-278


Phase 5: History Update and Loop Decision

After tool execution, results are appended to conversation history:

Native Tool Call History (src/agent/loop_.rs:1063-1082):

{
  "role": "assistant",
  "content": "Let me check that...",
  "tool_calls": [{"id": "tc1", "name": "shell", "arguments": "..."}]
}
{
  "role": "tool",
  "tool_call_id": "tc1",
  "content": "stdout: file.txt"
}

XML Tool Call History (src/agent/loop_.rs:789-808):

Assistant: Let me check that...
<tool_call>
{"id": "tc1", "name": "shell", "arguments": {...}}
</tool_call>

Tool Results:
<tool_result name="shell">
stdout: file.txt
</tool_result>

Loop Decision (src/agent/loop_.rs:960-982):

  • Tool calls present: Continue to next iteration
  • Text-only response: Return final response and exit
  • Max iterations reached: Return error

Sources: src/agent/loop_.rs:960-1107, src/agent/loop_.rs:789-808, src/agent/loop_.rs:1063-1082


Tool Call Parsing Deep Dive

The system supports multiple tool call formats to maximize LLM compatibility:

graph TB
    Response["LLM Response Text"] --> CheckNative["Structured<br/>tool_calls present?"]
    
    CheckNative -->|Yes| ParseStructured["parse_structured_tool_calls()<br/>loop_.rs:750"]
    CheckNative -->|No| TryJSON["Try parse as<br/>OpenAI JSON"]
    
    TryJSON -->|Success| ExtractToolCalls["Extract tool_calls array"]
    TryJSON -->|Fail| FindXML["Find XML tags<br/>tool_call/invoke"]
    
    FindXML -->|Found| ExtractXML["Extract JSON<br/>from tag body"]
    FindXML -->|Not Found| TryMarkdown["Find markdown<br/>```tool_call"]
    
    TryMarkdown -->|Found| ExtractMarkdown["Extract JSON<br/>from code block"]
    TryMarkdown -->|Not Found| TryGLM["Parse GLM format<br/>tool/param>value"]
    
    ParseStructured --> Return["Return<br/>Vec<ParsedToolCall>"]
    ExtractToolCalls --> Return
    ExtractXML --> Return
    ExtractMarkdown --> Return
    TryGLM --> Return

OpenAI Native JSON Format

Checks for a {"tool_calls": [...]} structure in the response. Used by OpenAI, Anthropic, and OpenRouter providers with native function-calling support.

Sources: src/agent/loop_.rs:600-613, src/agent/loop_.rs:318-347

XML Tag Format

Searches for <tool_call>, <toolcall>, <tool-call>, or <invoke> tags and extracts JSON from the body:

<tool_call>
{"name": "shell", "arguments": {"command": "ls"}}
</tool_call>

Partial Parse Recovery (src/agent/loop_.rs:645-670): If closing tag is missing, attempts to find JSON end using brace balancing.
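
A brace-balancing recovery of this kind can be sketched as follows. This is an illustrative reimplementation (with string-literal and escape handling so braces inside quoted values do not skew the count), not the exact code at loop_.rs:645-670:

```rust
// Find the byte index one past the end of the first complete JSON
// object in `text`, by balancing braces outside string literals.
// Returns None if no '{' is found or the object is unterminated.
fn find_json_end(text: &str) -> Option<usize> {
    let start = text.find('{')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (i, &b) in text.as_bytes().iter().enumerate().skip(start) {
        if in_string {
            match b {
                _ if escaped => escaped = false,
                b'\\' => escaped = true,
                b'"' => in_string = false,
                _ => {}
            }
        } else {
            match b {
                b'"' => in_string = true,
                b'{' => depth += 1,
                b'}' => {
                    depth -= 1;
                    if depth == 0 {
                        return Some(i + 1); // one past the closing brace
                    }
                }
                _ => {}
            }
        }
    }
    None // unbalanced: no complete JSON object found
}
```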

Sources: src/agent/loop_.rs:616-671

Markdown Code Block Format

Fallback for models that output tool calls in markdown:

```tool_call
{"name": "shell", "arguments": {"command": "ls"}}
```

Sources: src/agent/loop_.rs:676-709

GLM Proprietary Format

Handles GLM-specific line-based format:

shell/command>ls -la
browser_open/url>https://example.com
http_request/url>https://api.example.com

Alias Mapping (src/agent/loop_.rs:491-497):

  • browser_open → shell (with curl wrapper)
  • web_search → shell
  • bash → shell
  • http → http_request
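
A line parser for this format, including the alias mapping, might look like the following sketch (the curl wrapping applied to `browser_open` is omitted):

```rust
// Parse one GLM-style line of the form "tool/param>value" into
// (canonical_tool, param, value), applying the documented aliases.
fn parse_glm_line(line: &str) -> Option<(String, String, String)> {
    let (head, value) = line.split_once('>')?;
    let (tool, param) = head.split_once('/')?;
    let canonical = match tool {
        // Aliases collapse onto the canonical tool names.
        "browser_open" | "web_search" | "bash" => "shell",
        "http" => "http_request",
        other => other,
    };
    Some((canonical.to_string(), param.to_string(), value.to_string()))
}
```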

Sources: src/agent/loop_.rs:512-580, src/agent/loop_.rs:491-497


History Management

Conversation history is managed to prevent unbounded growth while preserving context:

Trimming Strategy

Simple Trimming (src/agent/loop_.rs:116-132):

  • Preserves system prompt (first message if role=system)
  • Drops oldest non-system messages when count exceeds max_history_messages
  • Applied before each LLM call
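
A sketch of this trimming policy, using a simplified `Msg` stand-in for the real `ChatMessage` type:

```rust
#[derive(Clone, Debug)]
struct Msg {
    role: &'static str,
    content: String,
}

// Keep the system prompt (if it is the first message) and drop the
// oldest non-system messages until the count fits the limit.
fn trim_history(history: &mut Vec<Msg>, max_history_messages: usize) {
    if history.len() <= max_history_messages {
        return;
    }
    let keep_system = history.first().map_or(false, |m| m.role == "system");
    let start = if keep_system { 1 } else { 0 };
    // Never drain past the end, even with a tiny limit.
    let excess = (history.len() - max_history_messages).min(history.len() - start);
    history.drain(start..start + excess);
}
```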

Auto-Compaction (src/agent/loop_.rs:158-205):

  • Triggered when non-system message count exceeds threshold (default: 50)
  • Keeps most recent N messages (default: 20)
  • Summarizes older messages using LLM-based summarization
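
The split-and-summarize step can be sketched with the LLM abstracted into a closure. Messages are simplified to plain strings here, and the 2000-character truncation fallback mirrors the documented behavior:

```rust
// Keep the most recent `keep_recent` messages and replace everything
// older with a single "[Compaction summary]" message. `summarize` is
// a stand-in for the LLM call; None means summarization failed.
fn compact_history(
    history: Vec<String>,
    keep_recent: usize,
    summarize: impl Fn(&str) -> Option<String>,
) -> Vec<String> {
    if history.len() <= keep_recent {
        return history;
    }
    let split = history.len() - keep_recent;
    let (older, recent) = history.split_at(split);
    let transcript = older.join("\n");
    // LLM summary, with deterministic truncation as a fallback.
    let summary = summarize(&transcript)
        .unwrap_or_else(|| transcript.chars().take(2000).collect());
    let mut out = vec![format!("[Compaction summary]\n{summary}")];
    out.extend_from_slice(recent);
    out
}
```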

Compaction Process

sequenceDiagram
    participant Loop as run_tool_call_loop()
    participant Compactor as auto_compact_history()
    participant Provider as Provider::chat_with_system()
    
    Loop->>Compactor: Check if compaction needed
    Compactor->>Compactor: Build transcript from<br/>old messages
    Compactor->>Provider: Summarize with system prompt:<br/>"conversation compaction engine"
    Provider-->>Compactor: Summary (max 2000 chars)
    Compactor->>Compactor: Replace old messages<br/>with [Compaction summary]
    Compactor-->>Loop: Continue with compacted history

Summarization Prompt (src/agent/loop_.rs:186-191):

You are a conversation compaction engine. Summarize older chat history into 
concise context for future turns. Preserve: user preferences, commitments, 
decisions, unresolved tasks, key facts. Omit: filler, repeated chit-chat, 
verbose tool logs. Output plain text bullet points only.

Fallback: If LLM summarization fails, the older transcript is deterministically truncated to 2000 characters instead.

Sources: src/agent/loop_.rs:158-205, src/agent/loop_.rs:116-132, src/agent/loop_.rs:134-156


Termination Conditions

The tool call loop terminates under these conditions:

| Condition | Location | Behavior |
|---|---|---|
| Text-only response | src/agent/loop_.rs:960-982 | Return response and exit successfully |
| Max iterations reached | src/agent/loop_.rs:865-869 | Return error with iteration count |
| Provider error | src/agent/loop_.rs:942-952 | Propagate error immediately |
| Empty response | src/agent/loop_.rs:960-982 | Return empty string (valid termination) |

Iteration Limit (src/agent/loop_.rs:865-869):

let max_iterations = if max_tool_iterations == 0 {
    DEFAULT_MAX_TOOL_ITERATIONS  // 10
} else {
    max_tool_iterations
};

Sources: src/agent/loop_.rs:960-982, src/agent/loop_.rs:865-869, src/agent/loop_.rs:21-23


Streaming Support

For channels that support progressive updates, the system can stream responses in chunks:

sequenceDiagram
    participant Loop as run_tool_call_loop()
    participant Channel as Channel::update_draft()
    participant User as User Interface
    
    Loop->>Loop: Receive final text response
    Loop->>Loop: Split on whitespace
    
    loop For each word
        Loop->>Loop: Accumulate until<br/>STREAM_CHUNK_MIN_CHARS (80)
        Loop->>Channel: Send chunk via on_delta
        Channel->>User: Update draft message
    end
    
    Loop->>Channel: Send final chunk
    Channel->>User: Finalize message

Chunk Strategy (src/agent/loop_.rs:964-979):

  • Splits text on whitespace boundaries
  • Accumulates words until chunk size ≥ 80 characters
  • Sends via tokio::sync::mpsc::Sender<String>
  • Allows channels to progressively update draft messages
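
A sketch of this chunking strategy with the mpsc sender replaced by a plain closure:

```rust
const STREAM_CHUNK_MIN_CHARS: usize = 80;

// Split the final text on whitespace and emit a chunk each time the
// accumulated words reach STREAM_CHUNK_MIN_CHARS; flush the remainder.
fn stream_in_chunks(text: &str, mut send: impl FnMut(String)) {
    let mut chunk = String::new();
    for word in text.split_whitespace() {
        if !chunk.is_empty() {
            chunk.push(' ');
        }
        chunk.push_str(word);
        if chunk.len() >= STREAM_CHUNK_MIN_CHARS {
            send(std::mem::take(&mut chunk));
        }
    }
    if !chunk.is_empty() {
        send(chunk); // flush the final partial chunk
    }
}
```

Because chunks break only on whitespace boundaries, channels can append them to a draft message without splitting words mid-way.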

Supported Channels:

  • Telegram (edit_message_text())
  • Discord (edit_message())
  • Mattermost (update_post())

Sources: src/agent/loop_.rs:18-19, src/agent/loop_.rs:964-979


Security Integration

The agent turn cycle integrates with SecurityPolicy at multiple checkpoints:

Approval Manager Integration

Supervised Mode (src/agent/loop_.rs:996-1024):

  • Intercepts tool calls before execution
  • Prompts for approval on CLI channel
  • Auto-approves on other channels
  • Records decisions with timestamps

Denial Response (src/agent/loop_.rs:1013-1022):

<tool_result name="tool_name">
Denied by user.
</tool_result>

The LLM receives the denial as a tool result and can adapt its response.

Credential Scrubbing

Pattern Detection (src/agent/loop_.rs:25-40):

static SENSITIVE_KEY_PATTERNS: LazyLock<RegexSet> = LazyLock::new(|| {
    RegexSet::new([
        r"(?i)token",
        r"(?i)api[_-]?key",
        r"(?i)password",
        r"(?i)secret",
        r"(?i)user[_-]?key",
        r"(?i)bearer",
        r"(?i)credential",
    ])
    .unwrap()
});

Redaction Strategy (src/agent/loop_.rs:45-77):

  • Preserves first 4 characters for context
  • Replaces remainder with *[REDACTED]
  • Example: api_key: "abcd*[REDACTED]"

This prevents tool output from leaking credentials into LLM context, which could be extracted via prompt injection or appear in future responses.
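
A std-only approximation of this scrubbing policy (the real code compiles the RegexSet shown above; this sketch covers a subset of the key patterns with plain substring checks, and handles one `key: value` or `key=value` pair per line):

```rust
// Subset of the documented sensitive-key patterns, matched
// case-insensitively as substrings of the key.
const SENSITIVE_KEYS: &[&str] = &[
    "token", "api_key", "apikey", "password", "secret", "bearer", "credential",
];

fn is_sensitive_key(key: &str) -> bool {
    let k = key.to_ascii_lowercase();
    SENSITIVE_KEYS.iter().any(|pat| k.contains(pat))
}

// Redact the value of a sensitive key: keep the first 4 characters
// for context and replace the remainder with "*[REDACTED]".
fn scrub_line(line: &str) -> String {
    for sep in [':', '='] {
        if let Some((key, value)) = line.split_once(sep) {
            if is_sensitive_key(key) {
                let shown: String = value.trim().chars().take(4).collect();
                return format!("{key}{sep}{shown}*[REDACTED]");
            }
        }
    }
    line.to_string()
}
```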

Sources: src/agent/loop_.rs:996-1024, src/agent/loop_.rs:25-77


Turn Cycle Configuration

Key configuration parameters that control turn behavior:

| Parameter | Default | Location | Purpose |
|---|---|---|---|
| max_tool_iterations | 10 | src/agent/loop_.rs:21-23 | Prevents runaway loops |
| max_history_messages | 50 | src/agent/loop_.rs:82 | Trigger for auto-compaction |
| COMPACTION_KEEP_RECENT_MESSAGES | 20 | src/agent/loop_.rs:85 | Messages preserved after compaction |
| STREAM_CHUNK_MIN_CHARS | 80 | src/agent/loop_.rs:18-19 | Minimum chunk size for streaming |
| min_relevance_score | 0.5 (typical) | src/agent/loop_.rs:210 | Memory recall threshold |

Sources: src/agent/loop_.rs:18-92


Testing Strategy

The agent turn cycle has comprehensive test coverage at two levels:

Unit Tests

Located in src/agent/tests.rs:1-876, covering:

  • Simple text responses
  • Single/multi-step tool chains
  • Max-iteration bailout
  • Unknown tool recovery
  • Tool execution failures
  • Parallel tool dispatch
  • History trimming
  • Native vs XML dispatcher paths
  • Empty/whitespace responses
  • Mixed text + tool call responses

E2E Integration Tests

Located in tests/agent_e2e.rs:1-355, validating:

  • Full turn cycle through public API
  • Mock providers and tools without external dependencies
  • Multi-turn conversation coherence
  • Unknown tool recovery
  • Parallel tool dispatch

Mock Infrastructure:

  • ScriptedProvider: Returns pre-scripted responses
  • EchoTool: Simple tool for validation
  • CountingTool: Tracks invocation count
  • FailingTool: Tests error recovery

Sources: src/agent/tests.rs:1-876, tests/agent_e2e.rs:1-355


Common Patterns

Single User Message → Simple Response

User: "What is 2 + 2?"
→ Context enrichment (memory/hardware)
→ LLM call #1
→ Response: "4"
→ Exit (no tools)

User Message → Tool Chain → Response

User: "Check if README.md exists and show its size"
→ LLM call #1 → tool: file_read(README.md)
→ Tool execution → result: "file exists, 1024 bytes"
→ LLM call #2 → text: "README.md exists and is 1024 bytes"
→ Exit

User Message → Parallel Tools → Response

User: "List files and check git status"
→ LLM call #1 → tools: [shell("ls"), git_operations(status)]
→ Tool execution (parallel) → results
→ LLM call #2 → text: "Here are the files: ... Git status: clean"
→ Exit

Sources: Demonstrated in tests/agent_e2e.rs:200-354


Entry Point Reference

The turn cycle can be triggered from multiple entry points:

| Entry Point | Function | File | Use Case |
|---|---|---|---|
| CLI Agent | agent::run() | src/main.rs:546-554 | Interactive terminal sessions |
| Gateway Webhook | handle_webhook() | src/gateway/mod.rs:621-804 | HTTP API requests |
| Channel Listener | process_channel_message() | Channels code | Telegram, Discord, Slack, etc. |
| Cron Scheduler | run_agent_job() | src/cron/scheduler.rs:119-149 | Scheduled autonomous tasks |

All entry points ultimately call agent_turn() or run_tool_call_loop() with the same core logic.

Sources: src/main.rs:546-554, src/gateway/mod.rs:621-804, src/cron/scheduler.rs:119-149

