---
title: Routing
description: How Manifest picks the cheapest model that can handle your query
icon: split
---

What is routing?

Instead of sending every request to the same expensive model, Manifest scores each query and routes it to the cheapest model that can handle it. Each request is classified along two axes:

  • Complexity tier — how hard the task is. Four tiers: simple, standard, complex, reasoning.
  • Specificity — what kind of task it is (coding, web browsing, data analysis, image generation, and so on). Nine categories in total.

Scoring happens in under 2 ms with zero external calls.

The four tiers

  • Simple — Greetings, definitions, short factual questions. Routed to the cheapest model.
  • Standard — General coding help, moderate questions. Good quality at low cost.
  • Complex — Multi-step tasks, large context, code generation. Best quality models.
  • Reasoning — Formal logic, proofs, math, multi-constraint problems. Reasoning-capable models only.

Specificities

On top of the complexity tier, Manifest detects what kind of task the request is about. Each category can be routed to a model picked specifically for it.

Category Covers
Coding Write, debug, and refactor code
Web browsing Navigate pages, search, and extract content
Data analysis Crunch numbers, run stats, build charts
Image generation Create and edit images, logos, visuals
Video generation Produce clips, animations, and edits
Social media Draft posts, plan content, track engagement
Email Compose, reply, and manage your inbox
Calendar Book meetings, check availability, reschedule
Trading Analyze markets, place trades, track positions

How detection works

Two signals feed the detector:

  1. Keyword dimensions. The same trie scan that feeds the complexity score also counts matches in category-specific dimensions (e.g. codeGeneration and technicalTerms count toward Coding, webBrowsing toward Web browsing, emailManagement toward Email).
  2. Tool names. When a request includes tool definitions, names with known prefixes boost the matching category — browser_* / playwright_* → Web browsing, gmail_* / outlook_* → Email, gcal_* / calendly_* → Calendar, and so on.

If a category crosses the match threshold, it wins. Otherwise no specificity is assigned and the request routes on complexity alone.
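The tool-name signal above can be sketched as a simple prefix match. This is an illustrative reconstruction, not Manifest's actual implementation; the prefix-to-category table mirrors the examples in the docs, and the count-based scoring is an assumption.

```typescript
// Map known tool-name prefixes to specificity categories (from the examples above).
const PREFIX_CATEGORY: Record<string, string> = {
  browser_: "web_browsing",
  playwright_: "web_browsing",
  gmail_: "email_management",
  outlook_: "email_management",
  gcal_: "calendar_management",
  calendly_: "calendar_management",
};

// Count how many tool names boost each category.
function boostFromTools(toolNames: string[]): Map<string, number> {
  const scores = new Map<string, number>();
  for (const name of toolNames) {
    for (const [prefix, category] of Object.entries(PREFIX_CATEGORY)) {
      if (name.startsWith(prefix)) {
        scores.set(category, (scores.get(category) ?? 0) + 1);
      }
    }
  }
  return scores;
}
```

A request defining `gmail_send`, `gmail_list`, and `browser_open` would boost Email twice and Web browsing once; whichever signal crosses the threshold wins.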

Overrides per category

In the dashboard Routing page, each specificity can be toggled on or off per agent. When active, you can:

  • Pin a model to the category. All matching requests go to it, regardless of complexity tier.
  • Set fallbacks. A fallback list specific to that category, tried in order if the primary model fails.
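The pin-plus-fallbacks behavior amounts to trying a list of models in order until one succeeds. A minimal sketch, assuming a caller-supplied `invoke` function (the function name and error handling are illustrative, not Manifest's API):

```typescript
// Try the pinned model first, then each fallback in order; rethrow if all fail.
function callWithFallbacks(
  models: string[],
  invoke: (model: string) => string,
): string {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return invoke(model);
    } catch (err) {
      lastError = err; // this model failed; move on to the next fallback
    }
  }
  throw lastError;
}
```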

Explicit override (header)

Clients can skip detection by sending an x-manifest-specificity request header with a category ID (coding, web_browsing, data_analysis, image_generation, video_generation, social_media, email_management, calendar_management, or trading). The header value is used directly, with confidence 1.0.
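A client-side sketch of building that header, with the category IDs taken from the list above (validating against the known IDs before sending is an assumption, not required by the API):

```typescript
// Valid specificity IDs, as listed in the docs.
const CATEGORY_IDS = [
  "coding", "web_browsing", "data_analysis", "image_generation",
  "video_generation", "social_media", "email_management",
  "calendar_management", "trading",
];

// Build request headers that skip detection entirely (confidence 1.0 server-side).
function buildOverrideHeaders(category: string): Record<string, string> {
  if (!CATEGORY_IDS.includes(category)) {
    throw new Error(`unknown specificity: ${category}`);
  }
  return {
    "Content-Type": "application/json",
    "x-manifest-specificity": category,
  };
}
```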

How scoring works

Manifest scores 23 dimensions grouped into three categories. The same scoring pipeline feeds both the complexity tier and the specificity detector.

Keyword-based (14) — Scans the prompt for patterns like "prove", "write function", "what is", etc.

Structural (5) — Analyzes token count, nesting depth, code-to-prose ratio, conditional logic, and constraint density.

Contextual (4) — Considers expected output length, repetition requests, tool count, and conversation depth.

Each dimension has a weight. The weighted sum maps to a tier via threshold boundaries. A confidence score (0–1) indicates how clearly the request fits its tier.
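The weighted-sum-to-tier step can be sketched as follows. The weights and threshold boundaries here are invented for illustration; Manifest's real values are not published in this page.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Hypothetical tier boundaries over a normalized score (upper bound, tier).
const BOUNDARIES: Array<[number, Tier]> = [
  [0.25, "simple"],
  [0.5, "standard"],
  [0.75, "complex"],
  [1.0, "reasoning"],
];

// Weighted sum over dimension scores: sum of weight_i * value_i.
function weightedScore(
  dims: Record<string, number>,
  weights: Record<string, number>,
): number {
  let sum = 0;
  for (const [name, value] of Object.entries(dims)) {
    sum += (weights[name] ?? 0) * value;
  }
  return sum;
}

// Map the score to a tier via the threshold boundaries.
function tierFromScore(score: number): Tier {
  for (const [upper, tier] of BOUNDARIES) {
    if (score <= upper) return tier;
  }
  return "reasoning";
}
```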

Session momentum

Manifest remembers the last 5 tier assignments (30-minute TTL).

Short follow-up messages ("yes", "do it") inherit momentum from the conversation, so they don't drop to a cheaper tier unnecessarily.
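A minimal sketch of this momentum rule. The 5-entry window and 30-minute TTL come from the text above; the short-message cutoff and "inherit the highest recent tier" policy are assumptions made for illustration.

```typescript
const TIER_RANK = { simple: 0, standard: 1, complex: 2, reasoning: 3 } as const;
type Tier = keyof typeof TIER_RANK;

interface SessionEntry {
  tier: Tier;
  at: number; // epoch millis of the assignment
}

// Short follow-ups inherit the highest tier seen recently instead of
// dropping to whatever their own (tiny) prompt would score.
function momentumTier(
  history: SessionEntry[],
  now: number,
  scored: Tier,
  promptTokens: number,
): Tier {
  const TTL_MS = 30 * 60 * 1000;
  const recent = history.filter((e) => now - e.at < TTL_MS).slice(-5);
  if (promptTokens > 20 || recent.length === 0) return scored; // long enough to stand alone
  const best = recent.reduce((a, b) =>
    TIER_RANK[a.tier] >= TIER_RANK[b.tier] ? a : b,
  );
  return TIER_RANK[best.tier] > TIER_RANK[scored] ? best.tier : scored;
}
```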

Tier overrides

Some signals force a minimum tier regardless of the score:

Signal Minimum tier
Tools detected standard
Large context (>50k tokens) complex
Formal logic keywords reasoning
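Applying those minimums can be sketched as a final clamp after scoring (the signal flags and their names are assumptions; the thresholds match the table above):

```typescript
const RANK: Record<string, number> = {
  simple: 0, standard: 1, complex: 2, reasoning: 3,
};

// Raise the scored tier to the highest minimum forced by any signal.
function applyTierOverrides(
  tier: string,
  signals: { hasTools: boolean; contextTokens: number; formalLogic: boolean },
): string {
  let min = "simple";
  if (signals.hasTools) min = "standard";
  if (signals.contextTokens > 50_000 && RANK[min] < RANK.complex) min = "complex";
  if (signals.formalLogic) min = "reasoning";
  return RANK[tier] >= RANK[min] ? tier : min;
}
```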

Response headers

Every response includes these headers:

Header Description
X-Manifest-Tier Assigned complexity tier
X-Manifest-Specificity Detected specificity category (only set when one was assigned)
X-Manifest-Model Actual model used
X-Manifest-Provider Provider (anthropic, openai, google, etc.)
X-Manifest-Confidence Scoring confidence (0–1)
X-Manifest-Reason Why this tier was selected
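On the client, these headers can be read straight off the response. The sketch below models headers as a plain lowercase-keyed map rather than calling the live API; the helper name is illustrative.

```typescript
// Extract routing metadata from response headers (keys lowercased, as
// most HTTP clients normalize them).
function routingInfo(headers: Record<string, string>) {
  return {
    tier: headers["x-manifest-tier"],
    model: headers["x-manifest-model"],
    provider: headers["x-manifest-provider"],
    specificity: headers["x-manifest-specificity"] ?? null, // only set when assigned
    confidence: Number(headers["x-manifest-confidence"] ?? "0"),
    reason: headers["x-manifest-reason"],
  };
}
```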

Cloud vs Self-hosted

  • Cloud — Routing is performed server-side. Model mappings are managed by the Manifest team and updated regularly.
  • Self-hosted — Routing runs on your own machine inside the Self-hosted backend. The model-to-tier mapping is seeded on first boot and can be customized in the dashboard.