Feature Request: Allow configuring reasoning behavior (e.g. disable reasoning) for LLM calls #2612
Description
Describe the feature you'd like
Karakeep currently does not provide a way to control reasoning behavior (e.g. enabling/disabling chain-of-thought / reasoning tokens) when making LLM calls.
This causes issues when using reasoning-capable models (e.g. Qwen3.5 via vLLM), especially in combination with structured output (JSON schema).
Problem
When using Karakeep with a reasoning-capable model:
- The model generates reasoning tokens first
- Then it generates the final answer

However, Karakeep also:
- Enforces a `max_tokens` limit (e.g. 2048)
- Uses `response_format: json_schema` with `strict: true`

This leads to the following failure mode:
- The model consumes all tokens on reasoning
- No tokens remain for the final structured output
- The response returns `content: null` with `finish_reason: "length"`
Example (real trace)
- `prompt_tokens`: ~2244
- `completion_tokens`: 2048
- `content`: null
- `reasoning_content`: present

Result: no usable output despite a successful API call.
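The trace above can be reproduced in shape with a minimal request/response sketch. The model name, schema, and endpoint wiring below are illustrative assumptions, not Karakeep's actual payload:

```python
# Sketch of the kind of OpenAI-compatible request involved (assumed values).
request_body = {
    "model": "qwen3-32b",   # assumed reasoning-capable model served by vLLM
    "max_tokens": 2048,     # this budget is shared by reasoning AND the answer
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "tags",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "tags": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["tags"],
            },
        },
    },
    "messages": [{"role": "user", "content": "Suggest tags for this bookmark ..."}],
}

# When the model spends the whole budget on reasoning, a vLLM-style response
# choice looks like this: no content, truncated by length.
failed_response_choice = {
    "finish_reason": "length",
    "message": {"content": None, "reasoning_content": "<chain of thought...>"},
}
```

The key point is that `max_tokens` caps reasoning and answer together, so a long chain-of-thought can starve the structured output entirely.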
Root Cause
Karakeep does not expose or allow passing reasoning-related parameters such as:

`"reasoning": { "effort": "none" }`

Without this, reasoning-enabled models default to generating chain-of-thought, which:
- wastes token budget
- breaks structured output
- increases latency
Proposed Solution
Allow users to configure reasoning behavior for Karakeep's LLM calls, e.g. disable reasoning entirely or set a reasoning-effort level.
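As a sketch of what this could look like: with an OpenAI-compatible client, a reasoning override could be forwarded as an extra request field (the OpenAI Python SDK exposes this as `extra_body`). The `INFERENCE_REASONING_EFFORT` env var and the helper below are hypothetical, not existing Karakeep options:

```python
import os


def build_completion_kwargs(base_kwargs: dict) -> dict:
    """Merge an optional reasoning override into LLM call kwargs.

    INFERENCE_REASONING_EFFORT is a hypothetical env var name,
    not an existing Karakeep setting.
    """
    effort = os.environ.get("INFERENCE_REASONING_EFFORT")
    kwargs = dict(base_kwargs)
    if effort:
        # OpenAI-compatible servers such as vLLM accept extra top-level
        # request fields; the OpenAI Python SDK passes them via `extra_body`.
        kwargs.setdefault("extra_body", {})["reasoning"] = {"effort": effort}
    return kwargs


# Example: with the env var set to "none", the override is attached.
os.environ["INFERENCE_REASONING_EFFORT"] = "none"
kwargs = build_completion_kwargs({"model": "qwen3-32b", "max_tokens": 2048})
# kwargs["extra_body"] == {"reasoning": {"effort": "none"}}
```

A setting like this would leave current behavior unchanged when unset, while letting users of reasoning-capable models reclaim the token budget for the structured answer.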
Describe the benefits this would bring to existing Karakeep users
This feature would significantly improve reliability, performance, and compatibility of Karakeep’s LLM integrations, especially as newer reasoning-capable models become more common.
Can the goal of this request already be achieved via other means?
I use LiteLLM in between, so I might be able to control reasoning behaviour there for the Karakeep virtual key.
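A workaround sketch along those lines: LiteLLM's proxy config can attach default parameters per model entry. The exact key for reasoning control depends on the LiteLLM version and backend, so the `reasoning_effort` field below is an assumption to verify against LiteLLM's docs, and the model/endpoint names are placeholders:

```yaml
# LiteLLM proxy config (sketch; verify keys against LiteLLM documentation)
model_list:
  - model_name: karakeep-model        # name the Karakeep virtual key points at
    litellm_params:
      model: hosted_vllm/qwen3-32b    # assumed vLLM backend
      api_base: http://vllm:8000/v1
      reasoning_effort: "none"        # assumption: suppresses reasoning tokens
```

Even if this works, it only papers over the issue for proxy users; a first-class Karakeep setting would cover direct OpenAI-compatible endpoints too.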
Have you searched for an existing open/closed issue?
- I have searched for existing issues and none cover my fundamental request
Additional context
No response