fix: enable token usage tracking and configurable stream timeout for Ollama provider #8493
Open
kasjens wants to merge 7 commits into aaif-goose:main from
Conversation
- …e tracking in streaming responses
- …rue) to enable token usage tracking while allowing older Ollama builds to opt out
- …ng stream_options instead of silently defaulting to enabled
- …back usage parsing for Ollama-native token fields (prompt_eval_count/eval_count)
- …ive token counters (prompt_eval_count/eval_count)
- … stall errors on every chunk
jamadeo reviewed Apr 13, 2026
    input_limit.or(model_config.context_limit)
}

fn resolve_ollama_stream_usage() -> bool {
Member
Can we do without a config flag? Maybe https://docs.ollama.com/api-reference/get-version and gate this on a minimum version?
Author
Thanks @jamadeo! I considered that but went with the config flag because:
- Ollama isn't always direct — users behind proxies or compatible API servers (LiteLLM, LocalAI) may not expose /api/version.
- Version ≠ capability — custom builds/forks may support stream_options without a recognizable version string.
- It defaults to enabled — so modern installs work out of the box. Only users on older builds need to set OLLAMA_STREAM_USAGE=false.
Happy to add version detection as a best-effort first pass with the config flag as fallback if you'd prefer that approach!
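For reference, a best-effort version check along the lines discussed might look like the sketch below. It is purely illustrative: the /api/version endpoint shape follows the linked docs, but the helper name, the minimum-version tuple, and the reqwest/serde_json usage are assumptions, with the config flag as the fallback when no version is recoverable.

use serde_json::Value;

// `fallback` carries the OLLAMA_STREAM_USAGE flag value, used when the
// endpoint is missing or the version string is unrecognizable.
fn supports_stream_options(base_url: &str, min: (u64, u64), fallback: bool) -> bool {
    let version = reqwest::blocking::get(format!("{base_url}/api/version"))
        .ok()
        .and_then(|resp| resp.json::<Value>().ok())
        .and_then(|body| body["version"].as_str().map(str::to_owned));
    match version {
        Some(v) => {
            // Compare major.minor against the assumed minimum.
            let mut parts = v.split('.').filter_map(|p| p.parse::<u64>().ok());
            (parts.next().unwrap_or(0), parts.next().unwrap_or(0)) >= min
        }
        // Proxies, forks, and compat servers: defer to the config flag.
        None => fallback,
    }
}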
Member
Makes sense, and it seems like most of the time you'd never need to set it at all
michaelneale approved these changes Apr 14, 2026
Collaborator
michaelneale left a comment
I think this is worth having in, thanks!
Summary
Three related fixes for the Ollama provider:
Token usage tracking: The provider was unconditionally stripping stream_options: {"include_usage": true} from requests (introduced in #7723, "fix: prevent Ollama provider from hanging on tool-calling requests"), which prevented Ollama from returning token counts in streaming responses. This is now gated behind OLLAMA_STREAM_USAGE (default: true), so modern Ollama builds get usage tracking while older builds can opt out with OLLAMA_STREAM_USAGE=false. Invalid values are handled safely: a warning is logged and stream_options is disabled.
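A minimal sketch of that flag resolution, reusing the resolve_ollama_stream_usage() signature from the diff above; the body is illustrative (eprintln! stands in for the project's logger):

fn resolve_ollama_stream_usage() -> bool {
    match std::env::var("OLLAMA_STREAM_USAGE") {
        // Unset: default to enabled so modern installs work out of the box.
        Err(_) => true,
        Ok(raw) => match raw.trim().to_ascii_lowercase().as_str() {
            "true" | "1" => true,
            "false" | "0" => false,
            // Invalid value: warn and disable stream_options rather than guess.
            other => {
                eprintln!("invalid OLLAMA_STREAM_USAGE value {other:?}; disabling stream_options");
                false
            }
        },
    }
}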
Fallback usage parsing: Added a fallback for the Ollama-native token fields (prompt_eval_count, eval_count) in get_usage(), so token tracking works even when Ollama doesn't translate them to the standard OpenAI field names (prompt_tokens, completion_tokens). OpenAI fields take precedence when both are present. Null OpenAI fields (e.g. "completion_tokens": null) correctly fall through to the Ollama-native fields instead of silently dropping usage.
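A hypothetical sketch of that fallback (the real get_usage() may differ; this assumes the native counters arrive alongside the OpenAI fields in the usage object, and uses serde_json):

use serde_json::Value;

fn parse_usage(usage: &Value) -> (Option<i64>, Option<i64>) {
    // as_i64() returns None for both missing and null fields, so a null
    // OpenAI field falls through to the Ollama-native counter.
    let input = usage["prompt_tokens"]
        .as_i64()
        .or_else(|| usage["prompt_eval_count"].as_i64());
    let output = usage["completion_tokens"]
        .as_i64()
        .or_else(|| usage["eval_count"].as_i64());
    (input, output)
}

fn main() {
    let usage: Value = serde_json::from_str(
        r#"{"completion_tokens": null, "prompt_eval_count": 12, "eval_count": 34}"#,
    )
    .unwrap();
    // completion_tokens is null, so both counts come from the native fields.
    assert_eq!(parse_usage(&usage), (Some(12), Some(34)));
}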
Configurable stream timeout: The hardcoded 30s per-chunk timeout was too aggressive for slower models (CPU inference, large parameter counts, complex reasoning). The timeout is now configurable via a resolution chain: OLLAMA_STREAM_TIMEOUT > GOOSE_STREAM_TIMEOUT > OLLAMA_TIMEOUT > default (120s). Zero values are treated as invalid and skipped to prevent immediate stall errors on every chunk.
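A sketch of that resolution chain (the env-var names and default come from the PR description; the helper itself is illustrative):

use std::time::Duration;

fn resolve_stream_timeout() -> Duration {
    const DEFAULT_SECS: u64 = 120;
    ["OLLAMA_STREAM_TIMEOUT", "GOOSE_STREAM_TIMEOUT", "OLLAMA_TIMEOUT"]
        .iter()
        .filter_map(|var| std::env::var(var).ok())
        .filter_map(|raw| raw.trim().parse::<u64>().ok())
        // Zero would trip the stall detector on every chunk; skip it as invalid.
        .find(|&secs| secs > 0)
        .map_or(Duration::from_secs(DEFAULT_SECS), Duration::from_secs)
}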
Testing
OLLAMA_STREAM_USAGE default-on and opt-out behavior
Related Issues
Relates to #8479
Relates to #8476
Relates to #7723