05.4 Tool Calling Architecture
This document describes ZeroClaw's tool calling architecture, which enables LLM providers to invoke agent capabilities (shell commands, file operations, memory access, etc.) through both native API primitives and prompt-guided fallbacks. The architecture supports 28+ providers with varying levels of tool calling support, from fully native (OpenAI, Anthropic, Gemini) to prompt-guided (Ollama, custom endpoints).
For information about the tool registry and individual tool implementations, see Tools. For provider-specific authentication and configuration, see Providers.
ZeroClaw's tool calling system operates in two modes based on provider capabilities:

- Native Tool Calling: Providers like OpenAI, Anthropic, and Gemini use API primitives (function calling, tool_use blocks, functionDeclarations) to handle tools. The provider returns structured ToolCall objects with IDs that enable precise multi-turn conversations.
- Prompt-Guided Tool Calling: Providers without native support receive tool documentation injected into the system prompt. The agent loop parses tool invocations from the LLM's text response using XML tags, JSON blocks, or provider-specific formats.
Sources: src/providers/traits.rs:195-227, src/agent/loop_.rs:873-890
Every provider implements the Provider trait, which includes capability declaration methods:
fn supports_native_tools(&self) -> bool
fn capabilities(&self) -> ProviderCapabilities
fn convert_tools(&self, tools: &[ToolSpec]) -> ToolsPayload

The ProviderCapabilities struct declares whether a provider supports native tool calling:

pub struct ProviderCapabilities {
    pub native_tool_calling: bool,
}

Sources: src/providers/traits.rs:195-227, src/providers/traits.rs:229-249
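As a sketch, a prompt-guided provider's capability declaration might look like the following. The method and struct names follow the trait excerpt above; the provider type and method bodies are illustrative.

```rust
// Simplified capability declaration for a hypothetical prompt-guided
// provider. Method and struct names follow the trait excerpt above;
// everything else is illustrative.
pub struct ProviderCapabilities {
    pub native_tool_calling: bool,
}

struct CustomProvider;

impl CustomProvider {
    // A provider without native tool support reports false here...
    fn supports_native_tools(&self) -> bool {
        false
    }

    // ...and its capabilities struct reflects the same answer.
    fn capabilities(&self) -> ProviderCapabilities {
        ProviderCapabilities {
            native_tool_calling: self.supports_native_tools(),
        }
    }
}

fn main() {
    let provider = CustomProvider;
    println!("native tools: {}", provider.capabilities().native_tool_calling);
}
```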
flowchart TD
AgentLoop["Agent Loop<br/>(run_tool_call_loop)"]
QueryCaps["provider.supports_native_tools()"]
NativeCheck{native_tool_calling?}
NativeMode["Native Mode:<br/>Pass tools via ChatRequest.tools"]
PromptMode["Prompt-Guided Mode:<br/>Inject tools into system prompt"]
ConvertTools["provider.convert_tools(specs)"]
NativePayload["ToolsPayload::OpenAI<br/>ToolsPayload::Anthropic<br/>ToolsPayload::Gemini"]
PromptPayload["ToolsPayload::PromptGuided<br/>{instructions}"]
AgentLoop --> QueryCaps
QueryCaps --> NativeCheck
NativeCheck -->|true| NativeMode
NativeCheck -->|false| PromptMode
NativeMode --> ConvertTools
PromptMode --> ConvertTools
ConvertTools -->|native_tool_calling=true| NativePayload
ConvertTools -->|native_tool_calling=false| PromptPayload
Sources: src/agent/loop_.rs:873-890, src/providers/traits.rs:244-249
OpenAI and OpenRouter use the OpenAI function calling format:
{
"model": "gpt-4",
"messages": [...],
"tools": [
{
"type": "function",
"function": {
"name": "shell",
"description": "Execute a shell command",
"parameters": { "type": "object", "properties": {...} }
}
}
]
}

The LLM response includes structured tool_calls:
{
"choices": [{
"message": {
"content": "Let me check the date.",
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "shell",
"arguments": "{\"command\": \"date\"}"
}
}
]
}
}]
}

Sources: src/providers/openai.rs:57-77, src/providers/openai.rs:92-105, src/providers/openrouter.rs:44-76
Anthropic uses content blocks with tool_use and tool_result types:
{
"model": "claude-3-5-sonnet-20241022",
"messages": [...],
"tools": [
{
"name": "shell",
"description": "Execute a shell command",
"input_schema": { "type": "object", "properties": {...} }
}
]
}

Response with tool calls:
{
"content": [
{
"type": "text",
"text": "Let me check the date."
},
{
"type": "tool_use",
"id": "toolu_123",
"name": "shell",
"input": { "command": "date" }
}
]
}

Tool results are sent back as user messages with tool_result blocks:
{
"role": "user",
"content": [
{
"type": "tool_result",
"tool_use_id": "toolu_123",
"content": "Fri Jan 10 14:30:00 UTC 2025"
}
]
}

Sources: src/providers/anthropic.rs:45-95, src/providers/anthropic.rs:126-145, src/providers/anthropic.rs:209-230
Gemini uses functionDeclarations in the tools field:
{
"model": "models/gemini-2.0-flash-exp",
"contents": [...],
"tools": [
{
"functionDeclarations": [
{
"name": "shell",
"description": "Execute a shell command",
"parameters": { "type": "OBJECT", "properties": {...} }
}
]
}
]
}

Sources: src/providers/gemini.rs:1-11 (Note: full native tool calling support is referenced, but implementation details are in the file)
For providers with native tool calling, the agent loop stores tool calls and results in a structured format that preserves call IDs:
Assistant message with tool calls:
{
"content": "Let me check that for you.",
"tool_calls": [
{
"id": "call_abc123",
"name": "shell",
"arguments": "{\"command\": \"date\"}"
}
]
}

Tool result message:
{
"tool_call_id": "call_abc123",
"content": "Fri Jan 10 14:30:00 UTC 2025"
}

This JSON is stored in the content field of ChatMessage and parsed by provider-specific message converters.
Sources: src/agent/loop_.rs:764-787, src/agent/loop_.rs:789-808, src/providers/openai.rs:169-234, src/providers/anthropic.rs:232-261
When supports_native_tools() returns false, the agent loop injects tool documentation into the system prompt:
// From Provider::chat default implementation
if let Some(tools) = request.tools {
if !tools.is_empty() && !self.supports_native_tools() {
let tool_instructions = match self.convert_tools(tools) {
ToolsPayload::PromptGuided { instructions } => instructions,
_ => bail!("Expected PromptGuided payload")
};
// Inject into system message
system_message.content.push_str("\n\n");
system_message.content.push_str(&tool_instructions);
}
}

The build_tool_instructions_text function generates documentation like:
You have access to the following tools:
## shell
Execute a shell command in the workspace.
Parameters:
- command (string, required): The shell command to execute
To use a tool, respond with:
<tool_call>
{"name": "shell", "arguments": {"command": "date"}}
</tool_call>
Sources: src/providers/traits.rs:305-327, src/providers/traits.rs:362-397
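A minimal sketch of how such instruction text could be assembled. ToolDoc and its fields are illustrative stand-ins, not ZeroClaw's actual ToolSpec type or its build_tool_instructions_text signature.

```rust
// Illustrative stand-in for a tool spec; not ZeroClaw's actual type.
struct ToolDoc {
    name: &'static str,
    description: &'static str,
    params: &'static str,
}

// Assemble prompt-guided documentation in the spirit of
// build_tool_instructions_text: a header, one section per tool,
// and the tool_call usage footer.
fn build_instructions(tools: &[ToolDoc]) -> String {
    let mut out = String::from("You have access to the following tools:\n");
    for t in tools {
        out.push_str(&format!("\n## {}\n{}\nParameters:\n{}\n", t.name, t.description, t.params));
    }
    out.push_str("\nTo use a tool, respond with:\n<tool_call>\n{\"name\": \"...\", \"arguments\": {...}}\n</tool_call>\n");
    out
}

fn main() {
    let docs = [ToolDoc {
        name: "shell",
        description: "Execute a shell command in the workspace.",
        params: "- command (string, required): The shell command to execute",
    }];
    print!("{}", build_instructions(&docs));
}
```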
The agent loop parses tool calls from LLM responses using multiple strategies, tried in order:
MiniMax and some OpenAI-compatible providers return tool calls in native JSON format:
{
"content": "Let me check that.",
"tool_calls": [
{
"id": "call_123",
"function": {
"name": "shell",
"arguments": "{\"command\": \"date\"}"
}
}
]
}

Sources: src/agent/loop_.rs:600-613, src/agent/loop_.rs:289-316
The primary prompt-guided format uses XML-style tags. Supported tags: <tool_call>, <toolcall>, <tool-call>, <invoke>:
Let me check the current date.
<tool_call>
{"name": "shell", "arguments": {"command": "date"}}
</tool_call>
The result will show the current time.
The parser extracts the JSON from within the tags and handles unclosed tags gracefully.
Sources: src/agent/loop_.rs:349-365, src/agent/loop_.rs:582-671
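The tag-extraction step can be sketched with std-only string scanning. This simplification handles only the <tool_call> spelling and a single call; the real parser also recognizes the alternate tag spellings and multiple calls per response.

```rust
// Extract the JSON payload between <tool_call> tags using only std string
// search. Returns None when no opening tag is present.
fn extract_tool_call(text: &str) -> Option<&str> {
    let open = "<tool_call>";
    let close = "</tool_call>";
    let start = text.find(open)? + open.len();
    // Handle an unclosed tag gracefully by taking the rest of the text.
    let end = text[start..].find(close).map(|i| start + i).unwrap_or(text.len());
    Some(text[start..end].trim())
}

fn main() {
    let resp = "Let me check the current date.\n<tool_call>\n{\"name\": \"shell\", \"arguments\": {\"command\": \"date\"}}\n</tool_call>";
    println!("{}", extract_tool_call(resp).unwrap());
    // prints: {"name": "shell", "arguments": {"command": "date"}}
}
```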
Models behind OpenRouter sometimes output tool calls in markdown code blocks:
Let me check that for you.
```tool_call
{"name": "shell", "arguments": {"command": "date"}}
```
The regex pattern r"(?s)```(?:tool[_-]?call|invoke)\s*\n(.*?)(?:```|</tool[_-]?call>)" matches hybrid formats.
Sources: src/agent/loop_.rs:676-709
GLM (Zhipu) models use proprietary line-based formats:
browser_open/url>https://example.com
shell/command>ls -la
http_request/url>https://api.example.com
The parser maps aliases (browser_open → shell), constructs arguments, and converts to the standard format.
Sources: src/agent/loop_.rs:486-580
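The line-based shape can be sketched with std-only splitting. This is a simplification: the alias mapping and argument construction done by the real parser are omitted, and the triple it returns is an illustrative intermediate form.

```rust
// Turn a GLM-style "tool/param>value" line into a (tool, param, value)
// triple. Alias mapping and argument construction are omitted.
fn parse_glm_line(line: &str) -> Option<(&str, &str, &str)> {
    // Split at the first '>' so values such as URLs stay intact.
    let (head, value) = line.split_once('>')?;
    let (tool, param) = head.split_once('/')?;
    Some((tool.trim(), param.trim(), value.trim()))
}

fn main() {
    let (tool, param, value) = parse_glm_line("shell/command>ls -la").unwrap();
    println!("tool={} {}={}", tool, param, value);
    // prints: tool=shell command=ls -la
}
```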
The parser does not extract arbitrary JSON from responses to prevent prompt injection attacks. Tool calls must be wrapped in one of the recognized formats:
// SECURITY: We do NOT fall back to extracting arbitrary JSON from the response
// here. That would enable prompt injection attacks where malicious content
// (e.g., in emails, files, or web pages) could include JSON that mimics a
// tool call.

This prevents an attacker from injecting tool call payloads into content that the LLM reads (e.g., email bodies, web pages, file contents).
Sources: src/agent/loop_.rs:732-740
sequenceDiagram
participant Loop as "run_tool_call_loop"
participant Provider as "Provider::chat"
participant Parser as "parse_tool_calls<br/>parse_structured_tool_calls"
participant Tool as "Tool Registry"
participant Security as "ApprovalManager"
Loop->>Provider: "chat(messages, tools, model, temp)"
Provider->>Provider: "Check supports_native_tools()"
alt Native Tools Supported
Provider->>Provider: "Send tools in API request"
Provider-->>Loop: "ChatResponse{tool_calls: [...]}"
Loop->>Parser: "parse_structured_tool_calls(resp.tool_calls)"
else Prompt-Guided
Provider->>Provider: "Inject tools into system prompt"
Provider-->>Loop: "ChatResponse{text: '...'}"
Loop->>Parser: "parse_tool_calls(resp.text)"
end
Parser-->>Loop: "Vec<ParsedToolCall>"
loop For each ParsedToolCall
Loop->>Security: "needs_approval(tool_name)?"
alt Approval Required
Security-->>Loop: "Prompt user (CLI) or auto-approve (channels)"
end
Loop->>Tool: "find_tool(registry, name)"
Tool-->>Loop: "dyn Tool"
Loop->>Tool: "tool.execute(arguments)"
Tool-->>Loop: "ToolResult{success, output}"
Loop->>Loop: "scrub_credentials(output)"
Loop->>Loop: "Append to history"
end
alt Has Tool Calls
Loop->>Loop: "Continue loop (next iteration)"
else Text Only
Loop->>Loop: "Return final text response"
end
Sources: src/agent/loop_.rs:851-1094
OpenAI and OpenRouter use identical native tool calling formats. Message conversion logic reconstructs tool_calls from JSON-encoded assistant messages:
fn convert_messages(messages: &[ChatMessage]) -> Vec<NativeMessage> {
messages.iter().map(|m| {
if m.role == "assistant" {
if let Ok(value) = serde_json::from_str::<serde_json::Value>(&m.content) {
if let Some(tool_calls_value) = value.get("tool_calls") {
// Parse and reconstruct NativeToolCall structs
}
}
}
// ... handle tool results similarly
})
}

Sources: src/providers/openai.rs:169-234, src/providers/openrouter.rs:138-203
Anthropic's message converter parses the native assistant history format to reconstruct tool_use and tool_result content blocks:
fn convert_messages(messages: &[ChatMessage]) -> (Option<SystemPrompt>, Vec<NativeMessage>) {
for msg in messages {
match msg.role.as_str() {
"assistant" => {
if let Some(blocks) = Self::parse_assistant_tool_call_message(&msg.content) {
native_messages.push(NativeMessage {
role: "assistant",
content: blocks, // Vec<NativeContentOut::ToolUse>
});
}
}
"tool" => {
if let Some(tool_result) = Self::parse_tool_result_message(&msg.content) {
native_messages.push(tool_result);
}
}
}
}
}

Anthropic also supports prompt caching via cache_control fields on tool definitions, system prompts, and conversation messages.
Sources: src/providers/anthropic.rs:232-282, src/providers/anthropic.rs:284-350
The OpenAiCompatibleProvider implements native tool calling for 20+ providers (Venice, Groq, Mistral, DeepSeek, xAI, etc.) using the OpenAI function calling format:
fn convert_tools(tools: Option<&[ToolSpec]>) -> Option<Vec<serde_json::Value>> {
tools.map(|items| {
items.iter().map(|tool| {
serde_json::json!({
"type": "function",
"function": {
"name": tool.name,
"description": tool.description,
"parameters": tool.parameters
}
})
}).collect()
})
}

The provider handles responses with a reasoning_content fallback for thinking models (Qwen3, GLM-4):
fn effective_content(&self) -> String {
match &self.content {
Some(c) if !c.is_empty() => c.clone(),
_ => self.reasoning_content.clone().unwrap_or_default(),
}
}

Sources: src/providers/compatible.rs:187-201, src/providers/compatible.rs:234-262
Ollama returns false for supports_native_tools(), forcing prompt-guided mode. However, some Ollama models (especially those fine-tuned on tool calling) return native tool_calls in responses. The provider converts these to JSON format that parse_tool_calls understands:
fn format_tool_calls_for_loop(&self, tool_calls: &[OllamaToolCall]) -> String {
let formatted_calls: Vec<serde_json::Value> = tool_calls.iter().map(|tc| {
let (tool_name, tool_args) = self.extract_tool_name_and_args(tc);
serde_json::json!({
"id": tc.id,
"type": "function",
"function": {
"name": tool_name,
"arguments": serde_json::to_string(&tool_args).unwrap_or("{}".to_string())
}
})
}).collect();
serde_json::json!({
"content": "",
"tool_calls": formatted_calls
}).to_string()
}

The provider also handles quirky model behavior where tool calls are wrapped:
{"name": "tool_call", "arguments": {"name": "shell", "arguments": {...}}}
{"name": "tool.shell", "arguments": {...}}
Sources: src/providers/ollama.rs:191-217, src/providers/ollama.rs:219-256, src/providers/ollama.rs:369-375
The agent loop executes tool calls iteratively until the LLM produces a text-only response:
for _iteration in 0..max_iterations {
let (response_text, parsed_text, tool_calls, assistant_history_content, native_tool_calls) =
provider.chat(ChatRequest { messages: history, tools: request_tools }, model, temperature).await?;
if tool_calls.is_empty() {
// No tool calls — final response
history.push(ChatMessage::assistant(response_text));
return Ok(display_text);
}
// Execute each tool call
let mut tool_results = String::new();
for call in &tool_calls {
let result = find_tool(tools_registry, &call.name)?.execute(call.arguments.clone()).await?;
let scrubbed = scrub_credentials(&result.output);
tool_results.push_str(&format!("<tool_result name=\"{}\">\n{}\n</tool_result>", call.name, scrubbed));
}
// Append tool calls and results to history
history.push(ChatMessage::assistant(assistant_history_content));
history.push(ChatMessage::user(tool_results));
}

The loop terminates when:

- The LLM produces text without tool calls
- max_tool_iterations is reached (default: 10)
- A tool execution error occurs (non-retryable)
Sources: src/agent/loop_.rs:851-1094, src/agent/loop_.rs:960-1094
| Provider | Native Tools | Format | Notes |
|---|---|---|---|
| OpenAI | ✅ | OpenAI function calling | Supports tool_calls array with IDs |
| Anthropic | ✅ | tool_use content blocks | Supports tool_result blocks with IDs |
| OpenRouter | ✅ | OpenAI function calling | Proxies native tool support from backend models |
| Gemini | ✅ | functionDeclarations | Uses functionCall responses |
| Venice | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| Groq | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| Mistral | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| DeepSeek | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| xAI (Grok) | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| Moonshot (Kimi) | ✅ | OpenAI-compatible | Via OpenAiCompatibleProvider |
| GLM (Zhipu) | ✅ | OpenAI-compatible + GLM-style | Dual format support |
| MiniMax | ✅ | Native JSON tool_calls | Parsed via Strategy 1 |
| Ollama | ❌ | Prompt-guided + quirk handling | Converts model tool_calls to JSON |
| Custom | ❌ | Prompt-guided | XML tags, markdown blocks |
Sources: src/providers/openai.rs:257-377, src/providers/anthropic.rs:387-466, src/providers/openrouter.rs:228-403, src/providers/compatible.rs:552-1061, src/providers/ollama.rs:259-375
For native tool calling, the agent loop stores assistant messages with structured tool call data:
let assistant_history_content = if resp.tool_calls.is_empty() {
response_text.clone()
} else {
build_native_assistant_history(&response_text, &resp.tool_calls)
};
history.push(ChatMessage::assistant(assistant_history_content));

This preserves call IDs and enables proper role: tool responses.
For prompt-guided providers, assistant messages store XML-wrapped tool calls:
fn build_assistant_history_with_tool_calls(text: &str, tool_calls: &[ToolCall]) -> String {
let mut parts = Vec::new();
if !text.trim().is_empty() {
parts.push(text.trim().to_string());
}
for call in tool_calls {
parts.push(format!("<tool_call>\n{}\n</tool_call>", serde_json::to_string(&call)?));
}
parts.join("\n")
}

Sources: src/agent/loop_.rs:927-931, src/agent/loop_.rs:789-808
Tool outputs are scrubbed before being sent back to the LLM to prevent credential exfiltration:
fn scrub_credentials(input: &str) -> String {
    SENSITIVE_KV_REGEX.replace_all(input, |caps: &regex::Captures| {
        let key = &caps[1];
        let val = /* extract value */;
        let prefix = if val.len() > 4 { &val[..4] } else { "" };
        format!("{}: {}*[REDACTED]", key, prefix)
    }).to_string()
}

Patterns matched: token, api_key, password, secret, bearer, credential.
Sources: src/agent/loop_.rs:25-77, src/agent/loop_.rs:1039-1044
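The same redaction idea can be sketched without the regex crate, scanning key: value lines for the sensitive key names listed above. This is a std-only simplification; the real implementation uses SENSITIVE_KV_REGEX and matches more flexibly.

```rust
// Sensitive key names, per the patterns listed above.
const SENSITIVE_KEYS: &[&str] = &["token", "api_key", "password", "secret", "bearer", "credential"];

// Redact the value of a "key: value" line when the key looks sensitive,
// keeping a short prefix for debuggability (as the excerpt above does).
fn scrub_line(line: &str) -> String {
    if let Some((key, val)) = line.split_once(':') {
        let k = key.trim().to_ascii_lowercase();
        if SENSITIVE_KEYS.iter().any(|s| k.contains(s)) {
            let val = val.trim();
            let prefix = if val.len() > 4 { &val[..4] } else { "" };
            return format!("{}: {}*[REDACTED]", key.trim(), prefix);
        }
    }
    line.to_string()
}

fn main() {
    println!("{}", scrub_line("api_key: sk-abcdef123456")); // redacted
    println!("{}", scrub_line("date: Fri Jan 10"));          // untouched
}
```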
- Unified Interface: The Provider trait abstracts tool calling differences across 28+ providers through supports_native_tools() and convert_tools().
- Multiple Parsing Strategies: The agent loop parses tool calls from native JSON, XML tags, markdown blocks, and GLM-style formats in a single unified flow.
- Security by Design: Tool calls must be wrapped in recognized formats to prevent prompt injection. Credentials are scrubbed from tool outputs before being sent to the LLM.
- History Fidelity: Native providers preserve tool call IDs for multi-turn conversations. Prompt-guided providers use XML-wrapped JSON for history reconstruction.
- Graceful Degradation: Providers without native support fall back to prompt-guided mode with tool documentation injected into the system prompt.
- Provider-Specific Quirks: Ollama converts native tool_calls to JSON format. GLM supports dual formats. Anthropic uses content blocks with caching.
Sources: src/agent/loop_.rs:851-1094, src/providers/traits.rs:195-397, src/providers/mod.rs:572-840