
Feat/disable thinking #156

Open
ai-jiayuan wants to merge 10 commits into memodb-io:dev from ai-jiayuan:feat/disable-thinking

Conversation

@ai-jiayuan

Summary

  • Add thinking_enabled config option (Optional[bool]) to control LLM thinking mode via config.yaml
  • Tri-state design: null = no intervention (API default), true = enable, false = disable
  • Support for Doubao (chat.completions only) and OpenAI-compatible providers (DashScope/Qwen via extra_body.enable_thinking)
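The tri-state design can be sketched as a small mapping from the config value to provider-specific kwargs. This is an illustrative sketch, not the actual Memobase code; the helper name `thinking_kwargs` and the `llm_style` values are assumptions based on this PR's description.

```python
# Sketch of the tri-state mapping (hypothetical helper, not Memobase's exact code).
from typing import Optional


def thinking_kwargs(thinking_enabled: Optional[bool], llm_style: str) -> dict:
    """Extra kwargs for the completion call; an empty dict means no intervention."""
    if thinking_enabled is None:
        return {}  # null in config.yaml: defer to the API's own default
    if llm_style == "doubao_cache":
        # Doubao/Volcengine format described in this PR
        return {"thinking": {"type": "enabled" if thinking_enabled else "disabled"}}
    # OpenAI-compatible providers (DashScope/Qwen) per this PR
    return {"extra_body": {"enable_thinking": thinking_enabled}}
```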

Changes

  • env.py: Add thinking_enabled: Optional[bool] = None config field
  • llms/__init__.py: Pass CONFIG.thinking_enabled to LLM factory calls
  • doubao_cache_llm.py: Inject thinking param only for chat.completions, skip context.completions
  • openai_model_llm.py: Inject extra_body.enable_thinking for DashScope/Qwen

Test plan

  • Verify thinking_enabled: null — API uses default behavior
  • Verify thinking_enabled: true — thinking enabled
  • Verify thinking_enabled: false — thinking disabled
  • Verify Doubao context.completions path does not receive thinking param

dongying and others added 8 commits April 13, 2026 22:59
- Uncomment thinking control code in doubao_cache_llm.py
- Add thinking_enabled config field (default: false) in env.py
- Pass CONFIG.thinking_enabled through llm_complete() call chain
- Add thinking_enable param to openai_model_llm.py to prevent errors

Usage: set `thinking_enabled: true/false` in config.yaml
or env var `MEMOBASE_THINKING_ENABLED=true/false`
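Since the field is `Optional[bool]`, the env var has to be parsed into three states, not two. A minimal sketch of such a parser (the env var name is from this PR; the parsing helper itself is hypothetical, not Memobase's actual settings loader):

```python
# Hypothetical tri-state env-var parser: unset/empty -> None, else truthy/falsy string.
import os
from typing import Optional


def parse_thinking_enabled(raw: Optional[str]) -> Optional[bool]:
    if raw is None or raw.strip().lower() in ("", "null", "none"):
        return None  # unset: no intervention, API default applies
    return raw.strip().lower() in ("1", "true", "yes")


# Example: parse_thinking_enabled(os.getenv("MEMOBASE_THINKING_ENABLED"))
```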

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…tions

context.completions.create is a separate Volcengine API that may not
support the thinking parameter. Passing it could cause API errors on
the primary cached-context path which Memobase uses most frequently.

Split thinking_kwargs out so it only applies to chat.completions.create
calls, leaving context.completions.create unaffected.
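The split described above can be sketched as follows. Function and parameter names here are assumptions for illustration; the point is only that the thinking param is attached on the `chat.completions` path and never on the `context.completions` path.

```python
# Illustrative sketch of keeping the thinking param off the cached-context path.
from typing import Optional


def build_call_kwargs(thinking_enabled: Optional[bool], use_cached_context: bool) -> dict:
    base = {"model": "doubao-placeholder", "temperature": 0.2}  # placeholder params
    if use_cached_context:
        # context.completions.create: a separate Volcengine API,
        # so it never receives the thinking param
        return base
    if thinking_enabled is not None:
        # chat.completions.create only, and only when explicitly configured
        base["thinking"] = {"type": "enabled" if thinking_enabled else "disabled"}
    return base
```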

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…Scope)

For DashScope/Qwen3 models, disable thinking via
extra_body={"enable_thinking": False}, which is the DashScope
native format for controlling deep thinking mode.

This ensures thinking can be disabled regardless of llm_style:
- doubao_cache: kwargs["thinking"] = {"type": "disabled"}
- openai: extra_body={"enable_thinking": False}

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- None (default): don't interfere, let API use its own default
- True: explicitly enable thinking
- False: explicitly disable thinking

This is model-agnostic: works for any provider without hardcoding
model names. Only injects thinking params when user explicitly
configures thinking_enabled in config.yaml.
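The "only inject when explicitly configured" rule for the OpenAI-compatible path can be shown as a request-kwargs builder. This is a sketch under assumed names (`complete_kwargs` is not a Memobase function); the `extra_body.enable_thinking` shape comes from this PR.

```python
# Sketch: the request body is left untouched unless thinking_enabled is set.
from typing import Optional


def complete_kwargs(model: str, messages: list, thinking_enabled: Optional[bool]) -> dict:
    kwargs = {"model": model, "messages": messages}
    if thinking_enabled is not None:
        # Only touch the request when the user explicitly configured the option
        kwargs["extra_body"] = {"enable_thinking": thinking_enabled}
    return kwargs
```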

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Both doubao_cache_complete() and openai_complete() now default
thinking_enable=None, matching CONFIG.thinking_enabled=None.
Prevents inconsistent behavior if functions are called directly
without the thinking_enable argument.
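A minimal sketch of the aligned defaults (heavily simplified; the real signatures and return values differ):

```python
# Both entry points default thinking_enable=None, matching CONFIG.thinking_enabled,
# so direct calls behave like thinking_enabled: null in config.yaml.
from typing import Optional


def doubao_cache_complete(prompt: str, thinking_enable: Optional[bool] = None) -> dict:
    kwargs: dict = {}
    if thinking_enable is not None:
        kwargs["thinking"] = {"type": "enabled" if thinking_enable else "disabled"}
    return kwargs


def openai_complete(prompt: str, thinking_enable: Optional[bool] = None) -> dict:
    kwargs: dict = {}
    if thinking_enable is not None:
        kwargs["extra_body"] = {"enable_thinking": thinking_enable}
    return kwargs
```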

Co-Authored-By: Claude Opus 4.6 <[email protected]>