Problem
When OpenClaw sends requests to Ollama through the Manifest router in streaming mode (the default), token usage is logged as 0|0 in the Manifest database. Non-streaming requests log tokens correctly.
Evidence
Querying agent_messages in the local Manifest DB:
| input_tokens | output_tokens | model | timestamp | mode |
|---|---|---|---|---|
| 31 | 10 | qwen2.5:7b | 19:07:57 | non-streaming (curl test) |
| 0 | 0 | qwen2.5:7b | 19:06:19 | streaming (OpenClaw) |
| 0 | 0 | qwen2.5:7b | 18:46:40 | streaming (OpenClaw) |
| 0 | 0 | qwen2.5:7b | 18:26:41 | streaming (OpenClaw) |
All real OpenClaw requests (streaming) show 0|0. The only entry with actual token counts is a manual curl with "stream": false.
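The query behind the table above can be sketched as follows. This is a minimal, self-contained illustration assuming a SQLite backing store; it builds an in-memory stand-in mirroring the `agent_messages` rows shown (the real DB path and full schema are assumptions):

```python
import sqlite3

# In-memory stand-in for the local Manifest DB; in practice you would
# connect to the real SQLite file instead (path is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agent_messages ("
    "input_tokens INTEGER, output_tokens INTEGER, "
    "model TEXT, timestamp TEXT, mode TEXT)"
)
conn.executemany(
    "INSERT INTO agent_messages VALUES (?, ?, ?, ?, ?)",
    [
        (31, 10, "qwen2.5:7b", "19:07:57", "non-streaming (curl test)"),
        (0, 0, "qwen2.5:7b", "19:06:19", "streaming (OpenClaw)"),
        (0, 0, "qwen2.5:7b", "18:46:40", "streaming (OpenClaw)"),
        (0, 0, "qwen2.5:7b", "18:26:41", "streaming (OpenClaw)"),
    ],
)

# Recent messages, newest first — the shape of the table in the Evidence section.
rows = conn.execute(
    "SELECT input_tokens, output_tokens, model, timestamp, mode "
    "FROM agent_messages ORDER BY timestamp DESC"
).fetchall()
for row in rows:
    print(row)
```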
Root cause
Ollama's OpenAI-compatible streaming endpoint includes token usage only in the final SSE chunk (with "finish_reason": "stop"). Example final chunk:
{"usage": {"prompt_tokens": 31, "completion_tokens": 10, "total_tokens": 41}}
Manifest's proxy likely logs token counts before processing the final chunk of the stream, or doesn't parse the usage field from the last SSE event.
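To illustrate the parsing side of the root cause, here is a minimal sketch of scanning an OpenAI-style SSE stream and pulling `usage` from the final chunk. The chunk payloads below are simplified models of the Ollama output described above, and the helper name is hypothetical, not Manifest code:

```python
import json

def extract_usage(sse_lines):
    """Scan an OpenAI-compatible SSE stream and return the usage dict
    from the last chunk that carries one (typically the final chunk)."""
    usage = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Intermediate chunks carry usage == null; only the final chunk
        # (finish_reason == "stop") carries real token counts.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage

# Simplified stream modeled on the final-chunk example above.
stream = [
    'data: {"choices":[{"delta":{"content":"Hi"},"finish_reason":null}],"usage":null}',
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":31,"completion_tokens":10,"total_tokens":41}}',
    "data: [DONE]",
]
print(extract_usage(stream))
```

A proxy that logs before this last chunk arrives, or that never inspects it, records 0|0 exactly as observed.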
Verification
Token counts were verified at each layer with the same prompt:
- Ollama native API: prompt_eval_count=31, eval_count=10 ✅
- Ollama OpenAI-compatible /v1: prompt_tokens=31, completion_tokens=10 ✅
- Manifest router /v1 (non-streaming): prompt_tokens=31, completion_tokens=10 ✅
- Manifest router /v1 (streaming): input_tokens=0, output_tokens=0 ❌
Expected behavior
Manifest should capture the usage field from the final SSE chunk of a streaming response and log it in agent_messages.input_tokens / output_tokens.
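A proxy-side fix could look roughly like this sketch: relay each chunk to the client while remembering any `usage` seen, and only write the log row after the stream is exhausted. `log_tokens` and the pre-parsed chunk shape are assumptions for illustration, not Manifest's actual API:

```python
logged = []

def log_tokens(input_tokens, output_tokens):
    # Placeholder for Manifest's DB write into agent_messages (hypothetical).
    logged.append((input_tokens, output_tokens))

def proxy_stream(upstream_chunks):
    """Relay a streaming response, then log tokens from the final chunk."""
    usage = None
    for chunk in upstream_chunks:      # dicts already parsed from SSE events
        if chunk.get("usage"):         # only the final chunk has real usage
            usage = chunk["usage"]
        yield chunk                    # forward to the client untouched
    # Log AFTER the stream ends, not when it starts.
    if usage:
        log_tokens(usage["prompt_tokens"], usage["completion_tokens"])
    else:
        log_tokens(0, 0)               # preserves the current fallback

chunks = [
    {"choices": [{"delta": {"content": "Hi"}}], "usage": None},
    {"choices": [{"delta": {}, "finish_reason": "stop"}],
     "usage": {"prompt_tokens": 31, "completion_tokens": 10, "total_tokens": 41}},
]
relayed = list(proxy_stream(chunks))
```

The key design point is deferring the `agent_messages` write until the generator is exhausted, since the token counts only exist at the end of the stream.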
Impact
- Dashboard shows 0 token consumption for all Ollama usage
- Cost tracking is broken for local models
- Routing decisions based on token history may be inaccurate