feat: full-stack model routing with tier auto-assignment #750

Closed
SebConejo wants to merge 33 commits into main from feat/routing-ui

Conversation

@SebConejo
Member

Summary

  • Routing module: Complete backend with controller, service, and tier auto-assignment scoring algorithm (cost-based with 1.5x context bonus for Complex, 3x reasoning bonus for Reasoning tier)
  • Database: 2 new migrations adding user_providers and tier_assignments tables, plus context_window/capability_reasoning/capability_code columns to model_pricing
  • Override invalidation: When a provider is disconnected or a model is removed from the pricing catalog, affected tier overrides are automatically cleared with notification messages
  • Frontend: New Routing page with 4 tier cards (auto/override/empty modes), model picker modal with provider-grouped search and pricing, Settings LLM Providers tab connected to backend API
  • Lazy user init: Tier assignments are created on first access to the routing page, with immediate auto-calculation if providers are connected
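
The override invalidation described in the summary could look roughly like the sketch below. Every name here (`TierAssignment`, `invalidateOverrides`, the `isModelAvailable` callback, the message wording) is a hypothetical stand-in rather than the PR's actual code; the sketch only assumes that each tier stores an auto-assigned model plus an optional manual override.

```typescript
// Hypothetical shapes for illustration; the PR's real entities may differ.
type Tier = "simple" | "standard" | "complex" | "reasoning";

interface TierAssignment {
  tier: Tier;
  autoModel: string | null;     // model chosen by auto-assignment
  overrideModel: string | null; // model pinned manually by the user
}

interface ClearedOverride {
  tier: Tier;
  model: string;
  message: string; // text that could back a toast or notification
}

// Clears every override whose model is no longer usable (provider
// disconnected or model removed from the pricing catalog) and returns
// one entry per cleared override.
function invalidateOverrides(
  assignments: TierAssignment[],
  isModelAvailable: (model: string) => boolean,
): ClearedOverride[] {
  const cleared: ClearedOverride[] = [];
  for (const assignment of assignments) {
    const model = assignment.overrideModel;
    if (model === null || isModelAvailable(model)) continue;
    assignment.overrideModel = null; // the tier falls back to autoModel
    cleared.push({
      tier: assignment.tier,
      model,
      message: `Override for the ${assignment.tier} tier was cleared: ${model} is no longer available`,
    });
  }
  return cleared;
}
```

Returning the cleared entries keeps the notification step separate from the data change, so the same helper could serve both the provider-disconnect path and the pricing-sync path.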

Test plan

  • 400 unit tests passing (40 suites)
  • 60 e2e tests passing (9 suites) including full routing lifecycle
  • Manual: Navigate to an agent → Settings → LLM Providers → connect a provider
  • Manual: Navigate to Routing → verify tier cards show auto-assigned models with pricing
  • Manual: Override a tier → verify it persists
  • Manual: Disconnect the provider → verify override clears with toast notification
  • Manual: Reset all to auto → verify all tiers revert

🤖 Generated with Claude Code

Implement a complete routing module allowing users to connect LLM providers,
automatically assign models to complexity tiers (Simple, Standard, Complex,
Reasoning) using a scoring algorithm, and manually override assignments.

Backend:
- Add model_pricing capabilities (context_window, reasoning, code) via migration
- Create user_providers and tier_assignments tables with migrations
- RoutingModule with controller, service, and tier auto-assign service
- Scoring algorithm: cost-based with context (1.5x) and reasoning (3x) bonuses
- Override invalidation notifications on provider disconnect and pricing sync
- Lazy user initialization on first routing page access
- getEffectiveModel runtime helper with provider validation
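
A minimal sketch of the initial scoring idea, for orientation only. The commit message gives just the two multipliers, and a later commit in this PR notes that the base score was 1/price. Everything else is an assumption: the `ModelInfo` shape, the single blended price, and the `LARGE_CONTEXT_THRESHOLD` value that decides when the context bonus applies.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Hypothetical model shape; field names are illustrative.
interface ModelInfo {
  name: string;
  pricePerMillionTokens: number; // assumed single blended price, must be > 0
  contextWindow: number;
  capabilityReasoning: boolean;
}

// Assumed cutoff for the context bonus; the PR text does not state one.
const LARGE_CONTEXT_THRESHOLD = 128000;

// Cost-based score: cheaper models score higher, with tier-specific bonuses.
function scoreModel(model: ModelInfo, tier: Tier): number {
  let score = 1 / model.pricePerMillionTokens;
  if (tier === "complex" && model.contextWindow >= LARGE_CONTEXT_THRESHOLD) {
    score *= 1.5; // context bonus for the Complex tier
  }
  if (tier === "reasoning" && model.capabilityReasoning) {
    score *= 3; // reasoning bonus for the Reasoning tier
  }
  return score;
}

// Picks the highest-scoring model for a tier, or null when none is available.
function pickModel(models: ModelInfo[], tier: Tier): ModelInfo | null {
  let best: ModelInfo | null = null;
  let bestScore = -Infinity;
  for (const model of models) {
    const score = scoreModel(model, tier);
    if (score > bestScore) {
      best = model;
      bestScore = score;
    }
  }
  return best;
}
```

Because the bonuses only multiply a price-driven base, a sufficiently cheap model can still outscore a more capable one.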

Frontend:
- Routing page with 4 tier cards (auto/override/empty modes)
- Model picker modal with provider-grouped search and pricing display
- Settings LLM Providers tab connected to backend API
- API functions for providers, tiers, and available models
- Provider SVG icons component

Tests: 400 unit tests (40 suites), 60 e2e tests (9 suites) all passing
…to tag

- Unify auto and manual tier cards to show provider icon, label, price,
  and a green "auto" tag when not manually overridden
- Filter available-models endpoint by user's connected providers so the
  model picker only shows models the user has access to
- Add Edit button next to Reset for manual overrides
- Fix Anthropic logo visibility in dark mode (use currentColor)
- Add prefix fallback for model lookups (alias → canonical name)
- Widen routing card header to prevent description wrapping
- Update description copy for clarity
- Trust any localhost origin in dev mode for random-port /serve
- Add missing model entries (Opus 4.6, dated Sonnet/Haiku variants,
  DeepSeek v3/r1 aliases, Mistral/Codestral short names, Qwen hyphenated IDs)
- Add date-suffix stripping and prefix fallback to getModelLabel so
  canonical pricing IDs resolve to clean names (e.g. Claude Sonnet 4.5)
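
The date-suffix stripping and prefix fallback could be sketched as follows. The label table, the helper names, and the two supported date formats (`-YYYYMMDD` and `-YYYY-MM-DD`) are assumptions made for illustration; the real `getModelLabel` is not shown in this PR text.

```typescript
// Illustrative label table; the real list lives in the frontend's model definitions.
const KNOWN_LABELS: Array<[string, string]> = [
  ["claude-sonnet-4-5", "Claude Sonnet 4.5"],
  ["claude-haiku-4-5", "Claude Haiku 4.5"],
];

// Removes a trailing "-YYYYMMDD" or "-YYYY-MM-DD" date suffix, if present.
function stripDateSuffix(modelId: string): string {
  return modelId.replace(/-(?:\d{8}|\d{4}-\d{2}-\d{2})$/, "");
}

function lookupExact(modelId: string): string | null {
  for (const entry of KNOWN_LABELS) {
    if (entry[0] === modelId) return entry[1];
  }
  return null;
}

// Resolution order: exact match, match after date stripping, then the
// longest known ID that is a prefix of the requested ID. Unknown IDs are
// returned unchanged.
function getModelLabelSketch(modelId: string): string {
  for (const candidate of [modelId, stripDateSuffix(modelId)]) {
    const label = lookupExact(candidate);
    if (label !== null) return label;
  }
  let best: [string, string] | null = null;
  for (const entry of KNOWN_LABELS) {
    const known = entry[0];
    const isPrefix = modelId.slice(0, known.length) === known;
    if (isPrefix && (best === null || known.length > best[0].length)) {
      best = entry;
    }
  }
  return best !== null ? best[1] : modelId;
}
```

Preferring the longest matching prefix avoids resolving a specific variant to a shorter, more generic entry when both are present in the table.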

The model picker now lists every model from the PROVIDERS definition
for each connected provider, not just the subset with pricing data.
Pricing info is shown when available from the API.

Match providers by both id and name when looking up icons and labels,
fixing cases where the DB provider name differs from the frontend
display name (e.g. Moonshot vs Kimi).

Centralizes provider resolution via resolveProviderId() with an alias
map (Google→gemini, Alibaba→qwen, Moonshot→moonshot) so icons, labels,
and connected-provider filtering work for all providers.

- Add shared provider-aliases utility (gemini↔google, qwen↔alibaba)
  used by both available-models endpoint and tier auto-assignment
- Provider icon on tier cards now always shows by falling back to
  PROVIDERS model list when pricing API data is unavailable
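
The two alias mechanisms above could be sketched like this. The alias pairs come from the commit messages; the function names other than `resolveProviderId`, the lowercase normalization, and the data layout are assumptions, and the real signatures may differ.

```typescript
// Frontend-style resolution: DB or display name -> canonical provider id.
// Pairs taken from the commit message; the real map may contain more entries.
const PROVIDER_ID_ALIASES: { [name: string]: string } = {
  google: "gemini",
  alibaba: "qwen",
  moonshot: "moonshot",
};

// Sketch of resolveProviderId(): unknown names fall through as their
// lowercased form.
function resolveProviderIdSketch(nameOrId: string): string {
  const key = nameOrId.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(PROVIDER_ID_ALIASES, key)
    ? PROVIDER_ID_ALIASES[key]
    : key;
}

// Backend-style expansion: each connected provider name is expanded to all
// of its known aliases so that filtering matches either spelling.
const ALIAS_GROUPS: string[][] = [
  ["gemini", "google"],
  ["qwen", "alibaba"],
];

function expandProviderNames(names: string[]): string[] {
  const expanded: string[] = [];
  for (const raw of names) {
    const name = raw.toLowerCase();
    let group: string[] = [name]; // providers without aliases map to themselves
    for (const candidate of ALIAS_GROUPS) {
      if (candidate.indexOf(name) !== -1) group = candidate;
    }
    for (const alias of group) {
      if (expanded.indexOf(alias) === -1) expanded.push(alias);
    }
  }
  return expanded;
}
```

For example, a user connected to `gemini` and `openai` would match pricing rows stored under `gemini`, `google`, or `openai`.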

Replace hardcoded #FCFAF5 with hsl(var(--muted)) so the provider icon
background adapts to dark mode.

The old scoring used 1/price as base, making the cheapest model win
every tier. The new scoring uses quality tiers for complex/reasoning
so capabilities (reasoning, code, large context) are the primary
factor and cost only breaks ties among equally capable models.
@codecov

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 97.91667% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.79%. Comparing base (c973b31) to head (cd4db6c).
⚠️ Report is 131 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #750      +/-   ##
==========================================
+ Coverage   90.12%   92.79%   +2.67%     
==========================================
  Files         135      100      -35     
  Lines        4781     2902    -1879     
  Branches     1294      538     -756     
==========================================
- Hits         4309     2693    -1616     
+ Misses        383      115     -268     
- Partials       89       94       +5     
Flag Coverage Δ
backend 93.39% <97.91%> (+0.65%) ⬆️
frontend ?
plugin 88.30% <ø> (ø)


…ment

Add a quality_score column (1-5) to model_pricing, representing a
manual quality judgment per model (5=frontier, 1=ultra-low-cost).

Replace formula-based scoring with deterministic per-tier strategies:
- SIMPLE: cheapest model wins
- STANDARD: cheapest among quality >= 2 (excludes ultra-low-cost)
- COMPLEX: highest quality wins, price as tiebreaker
- REASONING: highest quality among reasoning-capable models,
  falls back to COMPLEX logic if none available

This ensures different models are assigned to different tiers even
with a single provider connected.
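
The four per-tier strategies listed above translate fairly directly into code. The sketch below is an illustration under assumptions, not the PR's service: the `PricedModel` shape, the single `price` field, and returning `null` when a tier has no eligible model are all hypothetical choices.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Hypothetical row shape mirroring the columns mentioned in this PR.
interface PricedModel {
  name: string;
  price: number;        // assumed single blended price
  qualityScore: number; // 1-5, where 5 = frontier and 1 = ultra-low-cost
  capabilityReasoning: boolean;
}

// Lowest price wins; null when the list is empty.
function cheapest(models: PricedModel[]): PricedModel | null {
  return models.reduce<PricedModel | null>(
    (best, m) => (best === null || m.price < best.price ? m : best),
    null,
  );
}

// Highest quality wins; price breaks ties in favor of the cheaper model.
function bestQuality(models: PricedModel[]): PricedModel | null {
  return models.reduce<PricedModel | null>((best, m) => {
    if (best === null) return m;
    if (m.qualityScore !== best.qualityScore) {
      return m.qualityScore > best.qualityScore ? m : best;
    }
    return m.price < best.price ? m : best;
  }, null);
}

function assignTier(models: PricedModel[], tier: Tier): PricedModel | null {
  switch (tier) {
    case "simple":
      return cheapest(models);
    case "standard":
      // Excludes ultra-low-cost models (quality 1).
      return cheapest(models.filter((m) => m.qualityScore >= 2));
    case "complex":
      return bestQuality(models);
    case "reasoning": {
      const reasoning = models.filter((m) => m.capabilityReasoning);
      // Falls back to COMPLEX logic when no reasoning-capable model exists.
      return reasoning.length > 0 ? bestQuality(reasoning) : bestQuality(models);
    }
  }
}
```

With a catalog spanning several quality scores, each tier tends to resolve to a different model, which matches the stated goal of distinct assignments even with one provider connected.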

Scans staged files for common API key patterns (Anthropic, OpenAI,
Google, xAI, GitHub, GitLab, AWS, private keys) and blocks the commit
if any are found. Excludes test files and validation/pattern lines.

Place hyphen at end of character classes instead of escaping with
backslash, which causes grep errors on macOS.

Replace 20 repeated character classes with ERE {n,} quantifiers and
switch matching from grep -e (BRE) to grep -E (ERE). This avoids the
"repetition-operator operand invalid" error on macOS grep.

Use literal +++ instead of \+\+\+ in grep -v (BRE mode), since macOS
BSD grep interprets \+ as a repetition operator with no preceding atom.
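
The hook itself is a grep-based shell script, which is not reproduced here. The TypeScript sketch below only illustrates the same scanning idea with a different technique (JavaScript regular expressions over added lines). The patterns are simplified approximations of publicly known key prefixes, not the hook's real expressions, and `StagedFile`, `isTestFile`, and `findSecrets` are hypothetical names. It keeps the two regex habits the commits describe: the hyphen sits at the end of a character class, and repetition uses an `{n,}` quantifier instead of repeated classes.

```typescript
// Simplified, illustrative patterns; not the hook's actual expressions.
const SECRET_PATTERNS: Array<{ name: string; regex: RegExp }> = [
  { name: "Anthropic API key", regex: /sk-ant-[A-Za-z0-9_-]{20,}/ },
  { name: "GitHub token", regex: /ghp_[A-Za-z0-9]{36}/ },
  { name: "AWS access key ID", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "Private key block", regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
];

// Hypothetical input: one entry per staged file with its added lines.
interface StagedFile {
  path: string;
  addedLines: string[];
}

// Assumed convention for test files (*.spec.* and *.test.*).
function isTestFile(path: string): boolean {
  return /\.(spec|test)\.[jt]sx?$/.test(path);
}

// Returns one "path: pattern name" finding per matching pattern and line;
// a non-empty result would be the signal to block the commit.
function findSecrets(files: StagedFile[]): string[] {
  const findings: string[] = [];
  for (const file of files) {
    if (isTestFile(file.path)) continue; // test files are excluded
    for (const line of file.addedLines) {
      for (const pattern of SECRET_PATTERNS) {
        if (pattern.regex.test(line)) {
          findings.push(`${file.path}: ${pattern.name}`);
        }
      }
    }
  }
  return findings;
}
```

The exclusion of validation and pattern lines mentioned in the commit is not modeled in this sketch.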
@SebConejo SebConejo marked this pull request as draft February 21, 2026 17:35
The OpenRouter sync was importing ~178 extra models on every startup,
making the model list unwieldy. Now the sync only updates prices for
existing seeded models and never inserts new ones. The seeder always
upserts to ensure the curated list stays complete.
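
The "update only, never insert" sync rule can be sketched as below. `PriceRow`, `applyPriceSync`, and the in-memory arrays are hypothetical; the real service works against the model_pricing table and the OpenRouter response, neither of which is shown in this PR text.

```typescript
// Hypothetical row shape for a pricing entry.
interface PriceRow {
  modelName: string;
  inputPrice: number;
  outputPrice: number;
}

interface SyncResult {
  updated: string[]; // seeded models whose prices were refreshed
  ignored: string[]; // fetched models that are not part of the curated list
}

// Refreshes prices for models that already exist in the seeded list and
// ignores everything else, so the sync never grows the catalog.
function applyPriceSync(seeded: PriceRow[], fetched: PriceRow[]): SyncResult {
  const byName: { [name: string]: PriceRow | undefined } = Object.create(null);
  for (const row of seeded) byName[row.modelName] = row;

  const result: SyncResult = { updated: [], ignored: [] };
  for (const incoming of fetched) {
    const existing = byName[incoming.modelName];
    if (existing === undefined) {
      result.ignored.push(incoming.modelName);
      continue;
    }
    existing.inputPrice = incoming.inputPrice;
    existing.outputPrice = incoming.outputPrice;
    result.updated.push(incoming.modelName);
  }
  return result;
}
```

Under this split, the seeder stays the only place that decides which models exist, and the sync only decides what they cost.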
… list

The model selector was iterating the hardcoded PROVIDERS array (~92 models)
and fuzzy-matching to pricing data via startsWith, causing duplicate Anthropic
entries and missing xAI models. Now uses the available-models API response as
the source of truth, matching exactly what the Model Prices page shows.
…n, and E2E

- Complete routing.service.spec.ts: add tests for getProviders, upsertProvider,
  removeProvider (with override invalidation), setOverride, clearOverride,
  resetAllOverrides, invalidateOverridesForRemovedModels (9 → 27 tests)
- Complete pricing-sync.service.spec.ts: cover undefined data response,
  missing context_length, removed model detection + invalidation (13 → 18 tests)
- Add provider-aliases.spec.ts: cover all alias expansions and ?? [] fallback
- Add entity specs: user-provider.entity, tier-assignment.entity
- Add model-pricing-cache getAll tests (11 → 14 tests)
- Add unit specs: routing.controller, model-prices.controller,
  model-prices.service, session.guard, notification-email.service
- Add E2E specs: agents CRUD, costs, model-prices, validation
  (injection, XSS, DTO validation, auth edge cases)
- Fix provider-aliases: add xai self-alias for consistent expansion

Backend unit: 48 suites, 484 tests | E2E: 13 suites, 136 tests | All green
@SebConejo SebConejo marked this pull request as ready for review February 21, 2026 23:12
# Conflicts:
#	package-lock.json
#	packages/backend/src/auth/session.guard.spec.ts
#	packages/backend/src/model-prices/model-prices.controller.spec.ts
#	packages/backend/src/model-prices/model-prices.service.spec.ts
#	packages/backend/src/notifications/services/notification-email.service.spec.ts
#	packages/backend/src/notifications/services/notification-rules.service.spec.ts
#	packages/backend/src/notifications/services/notification-rules.service.ts
- Use timestampType()/timestampDefault() in UserProvider and TierAssignment
  entities instead of hardcoded 'timestamp' type (unsupported by SQLite)
- Use portableSql() in E2E specs to convert $N params to ? for SQLite
- Convert boolean params to 0/1 for SQLite in E2E seed queries
- Seed agent_messages directly in costs E2E to avoid timestamp format
  inconsistency between telemetry storage and query comparison
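
A rough idea of what the portability helpers might do is shown below. The real `portableSql()` is not shown in this PR text, so the signatures, the `Dialect` type, and the `portableParams` helper are assumptions. The placeholder rewrite is only safe when `$1..$N` each appear once, in ascending order, and never inside string literals, because `?` placeholders are purely positional.

```typescript
type Dialect = "postgres" | "sqlite";

// Rewrites Postgres-style "$1, $2, ..." placeholders to "?" for SQLite.
// Assumes each placeholder is used once and in ascending order.
function portableSqlSketch(sql: string, dialect: Dialect): string {
  return dialect === "sqlite" ? sql.replace(/\$\d+/g, "?") : sql;
}

// SQLite has no native boolean type, so booleans are sent as 0/1.
function portableParams(params: unknown[], dialect: Dialect): unknown[] {
  if (dialect !== "sqlite") return params;
  return params.map((p) => (typeof p === "boolean" ? (p ? 1 : 0) : p));
}
```

A seed query in an E2E spec could then pass `portableSqlSketch(sql, dialect)` and `portableParams(values, dialect)` to the same query call for both databases.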
@SebConejo SebConejo marked this pull request as draft February 23, 2026 04:56
…er packaging

- Add routing enable/disable toggle with active/inactive status indicator
- Add preset selector (Eco, Balanced, Quality, Fast, Custom) with bulk save
- Add model picker modal with provider grouping and search
- Improve model pricing with history tracking, name normalization, and unresolved model tracking
- Add product telemetry utility and pricing sync improvements
- Add 79+ frontend tests covering routing, setup wizard, auth, and page components
- Fix Routing test mocks (add missing getPresets/bulkSaveTiers)
- Add OTLP roundtrip and routing E2E tests
- Package manifest-server with LICENSE, README, smoke test, and copy-assets improvements
- Add openclaw-plugin postinstall script, skills, and product telemetry
- Update CI with SQLite cross-platform matrix and server smoke tests

Add overrides to pin @types/node to ^22.0.0 and regenerate lockfile.

- Add explicit types to Better Auth callback parameters (auth.instance.ts)
- Fix undefined not assignable to string|null in Routing.tsx
- Regenerate package-lock.json to fix @vitest/utils resolution

Backend e2e: clear model_pricing before test-specific seeds to avoid
conflicts with rows already inserted by the shared test helper.

Frontend: click the Integration tab before asserting on OTLP key,
Rotate key, and setup step elements that moved from the default
General tab.
# Conflicts:
#	package-lock.json
#	packages/backend/src/auth/auth.instance.ts
#	packages/backend/src/notifications/services/notification-email.service.spec.ts
#	packages/backend/test/costs.e2e-spec.ts
#	packages/backend/test/model-prices.e2e-spec.ts
#	packages/frontend/src/services/api.ts
@SebConejo SebConejo marked this pull request as ready for review February 23, 2026 23:51
@brunobuddy brunobuddy closed this Feb 27, 2026