
feat: make qwen3.6-plus-free default model, add compaction models cascade #234

Merged: konard merged 11 commits into main from issue-232-36180d84d898 on Apr 7, 2026

konard commented Apr 7, 2026

Contributor

Summary

Fixes #232

  • Change default model from minimax-m2.5-free to qwen3.6-plus-free (~1M context window, 5x larger)
  • Add --compaction-models CLI option, which accepts a links-notation sequence of model references to use as a compaction cascade
  • Default cascade: (big-pickle nemotron-3-super-free minimax-m2.5-free gpt-5-nano qwen3.6-plus-free same) — ordered from smallest/cheapest to largest context
  • Add new free models (qwen3.6-plus-free, nemotron-3-super-free) to provider priority lists and documentation
  • Case study with context window data and design rationale at docs/case-studies/issue-232/
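
To make the shape of the --compaction-models value concrete, here is a minimal sketch of splitting a parenthesized, whitespace-separated sequence into model references. It is an illustration only: parseModelSequence and DEFAULT_CASCADE are hypothetical names, and the sketch is neither the project's parser nor the links-notation library, which supports far more than a flat list.

```typescript
// Hypothetical helper for illustration; NOT the project's parser and NOT the
// links-notation library. It only handles a flat sequence such as "(a b c)".
function parseModelSequence(input: string): string[] {
  const trimmed = input.trim();
  // Accept an optional single pair of surrounding parentheses.
  const inner =
    trimmed.startsWith("(") && trimmed.endsWith(")")
      ? trimmed.slice(1, -1)
      : trimmed;
  // References are separated by whitespace; drop empty fragments.
  return inner.split(/\s+/).filter((ref) => ref.length > 0);
}

// The default cascade string quoted in the summary above.
const DEFAULT_CASCADE =
  "(big-pickle nemotron-3-super-free minimax-m2.5-free gpt-5-nano qwen3.6-plus-free same)";

// Six entries in order, ending with the special "same" reference.
const models = parseModelSequence(DEFAULT_CASCADE);
```

Keeping the result as an ordered array preserves the smallest-to-largest ordering that the cascade relies on.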

How the cascade works

During compaction, the system tries each model in order:

  1. Skip models whose context limit is smaller than the current used tokens
  2. If a model succeeds, compaction is complete
  3. If a model fails (rate limit, error), try the next model
  4. The special entry same, used as the final entry, falls back to the base model
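
The steps above can be sketched roughly as follows. This is an assumption-laden illustration rather than the code in compaction.ts: CascadeEntry, CompactFn, and compactWithCascade are hypothetical names, the context limits are supplied by the caller, and any thrown error (rate limit or otherwise) is treated as a reason to move on to the next model.

```typescript
// Hypothetical types and names for illustration; the PR's real interface is
// CompactionModelEntry, whose fields are not shown in this conversation.
interface CascadeEntry {
  id: string; // model reference, or "same" for the base model
  contextLimit: number; // context window in tokens, supplied by the caller
}

// Stand-in for the real compaction call; resolves to a summary or throws.
type CompactFn = (modelId: string) => Promise<string>;

async function compactWithCascade(
  cascade: CascadeEntry[],
  baseModelId: string,
  usedTokens: number,
  compact: CompactFn,
): Promise<{ modelId: string; summary: string } | undefined> {
  for (const entry of cascade) {
    // Step 1: skip models whose context limit is smaller than the used tokens.
    if (entry.contextLimit < usedTokens) continue;
    // Step 4: "same" resolves to the base model.
    const modelId = entry.id === "same" ? baseModelId : entry.id;
    try {
      // Step 2: the first model that succeeds ends the cascade.
      const summary = await compact(modelId);
      return { modelId, summary };
    } catch {
      // Step 3: on failure (rate limit, error), try the next model.
    }
  }
  // Every candidate was skipped or failed.
  return undefined;
}
```

Returning undefined when nothing succeeds is a choice made for this sketch; the PR does not say how the real implementation reports an exhausted cascade.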

Files changed

| Area | Files |
|---|---|
| Core | defaults.ts, argv.ts, run-options.js, model-config.js |
| Session | compaction.ts, prompt.ts, message-v2.ts |
| Provider | provider.ts (priority lists) |
| Docs | FREE_MODELS.md, MODELS.md, README.md, docs/case-studies/issue-232/ |
| Tests | compaction-model.test.ts |
| Release | .changeset/update-free-models-232.md |

Test plan

  • All 23 compaction model tests pass (including new tests for cascade, defaults, argv)
  • Full test suite: 352 pass, 13 fail (pre-existing MCP timeout test failures, unrelated)
  • ESLint/Prettier checks pass on changed JS files
  • Backward compatibility: --compaction-model (singular) still works
  • CompactionModelConfig without compactionModels field works (backward compat)
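
As an illustration of the two backward-compatibility points, the sketch below resolves a cascade from a config that may carry only the legacy single model, only the new list, or neither. The field name compactionModels comes from the test plan; everything else (CompactionConfigSketch, the compactionModel field, resolveCascade, and the precedence order) is an assumption, since the PR description does not state how the two options interact.

```typescript
// Hypothetical shape for illustration; only the compactionModels field name
// is taken from the PR text. The real CompactionModelConfig may differ.
interface CompactionConfigSketch {
  compactionModel?: string; // assumed: legacy value from --compaction-model
  compactionModels?: string[]; // cascade from --compaction-models
}

// Assumed precedence: explicit cascade, then legacy single model, then defaults.
function resolveCascade(
  config: CompactionConfigSketch,
  defaults: string[],
): string[] {
  if (config.compactionModels && config.compactionModels.length > 0) {
    return config.compactionModels;
  }
  if (config.compactionModel) {
    // A single legacy model behaves like a one-element cascade.
    return [config.compactionModel];
  }
  return defaults;
}
```

Treating the legacy option as a one-element cascade lets a single code path serve both flags, which is one way a config without compactionModels could keep working.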

Adding .gitkeep for PR creation (default mode).
This file will be removed when the task is complete.

Issue: #232
@konard konard self-assigned this Apr 7, 2026
konard and others added 7 commits April 7, 2026 15:20
…ascade (#232)

- Change DEFAULT_MODEL from minimax-m2.5-free to qwen3.6-plus-free
- Add --compaction-models CLI option accepting links notation sequence
- Implement compaction model cascade: tries models from smallest to largest context,
  skipping models whose context is too small or that hit rate limits
- Default cascade: (big-pickle nemotron-3-super-free minimax-m2.5-free gpt-5-nano qwen3.6-plus-free same)
- Update provider priority lists to include new free models
- Add CompactionModelEntry interface for cascade entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add qwen3.6-plus-free and nemotron-3-super-free to free models docs
- Update default model references from minimax-m2.5-free to qwen3.6-plus-free
- Add context window sizes to FREE_MODELS.md table
- Update MODELS.md pricing table and recommendation list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add test for DEFAULT_MODEL being qwen3.6-plus-free
- Add test for DEFAULT_COMPACTION_MODELS links notation sequence
- Add test for getCompactionModelsFromProcessArgv
- Add tests for CompactionModelConfig with cascade support
- Verify backward compatibility without compactionModels array

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document the motivation, solution design, context window data,
and cascade logic for the free models update and compaction cascade.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
konard changed the title from "[WIP] Update free models list in code and logs, and make qwen3.6-plus-free the default model instead of minimax-m2.5-free" to "feat: make qwen3.6-plus-free default model, add compaction models cascade" on Apr 7, 2026
@konard konard marked this pull request as ready for review April 7, 2026 15:28
konard and others added 2 commits April 7, 2026 15:30
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
konard commented Apr 7, 2026

Contributor Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

  • Public pricing estimate: $8.861717
  • Calculated by Anthropic: $8.911717 USD
  • Difference: $0.050000 (+0.56%)

📊 Context and tokens usage:

Claude Opus 4.6:

  • Context window: 140.3K / 1M input tokens (14%), 36.5K / 128K output tokens (28%)

Total: 142.0K + 13.4M cached input tokens, 36.5K output tokens, $8.495370 cost

Claude Haiku 4.5:

  • Context window: 1.3M / 200K input tokens (634%), 10.1K / 64K output tokens (16%)

Total: 173.0K + 1.1M cached input tokens, 10.1K output tokens, $0.366347 cost

🤖 Models used:

  • Tool: Anthropic Claude Code
  • Requested: opus
  • Main model: Claude Opus 4.6 (claude-opus-4-6)
  • Additional models:
    • Claude Haiku 4.5 (claude-haiku-4-5-20251001)

📎 Log file uploaded as Gist (3152KB)


The working session has now ended. Feel free to review the solution draft and leave feedback.

konard commented Apr 7, 2026

Contributor Author

✅ Ready to merge

This pull request is now ready to be merged:

  • All CI checks have passed
  • No merge conflicts
  • No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag


Development

Successfully merging this pull request may close these issues.

Update free models list in code and logs, and make qwen3.6-plus-free the default model instead of minimax-m2.5-free
