feat: add AgentRouter setup + improve /health endpoint #270
Open
Mayanks584 wants to merge 1 commit into Alishahryar1:main
Pull request overview
Adds a richer /health endpoint (including provider/config/runtime diagnostics) and documents an AgentRouter-hosted alternative path for getting started without running the local proxy.
Changes:
- Added `api.health.build_health_payload()` and wired `GET /health` to return a detailed runtime/config snapshot (with auth-gated `GET` and unauthenticated `HEAD /health` for liveness).
- Exposed rate-limiter state via `GlobalRateLimiter.snapshot()` and a `providers.registry.provider_rate_limit_snapshot()` façade for diagnostics.
- Added extensive documentation updates (AgentRouter setup in `README.md`, new `PROJECT.md`) and a comprehensive new test suite for `/health`.
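The split between an auth-gated `GET` and an unauthenticated `HEAD` can be sketched in plain Python. This is a minimal illustration, not the PR's route code: `handle_health` is a hypothetical helper, and the bearer-token header format and the "no token configured means open access" behaviour are assumptions based on `ANTHROPIC_AUTH_TOKEN` being described as optional.

```python
import hmac
import json
from typing import Optional, Tuple


def handle_health(
    method: str,
    auth_header: Optional[str],
    expected_token: Optional[str],
) -> Tuple[int, str]:
    """Return (status_code, body) for a /health request (hypothetical helper)."""
    if method == "HEAD":
        # Liveness probe: never requires auth and never returns a body.
        return 200, ""
    if method != "GET":
        return 405, ""
    if expected_token:
        # Assumption: the token arrives as "Authorization: Bearer <token>".
        supplied = (auth_header or "").removeprefix("Bearer ").strip()
        # Constant-time comparison avoids leaking token contents via timing.
        if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
            return 401, json.dumps({"error": "unauthorized"})
    # The real endpoint would return the detailed snapshot here.
    return 200, json.dumps({"status": "healthy"})
```

With this shape, an orchestrator can keep probing `HEAD /health` without credentials, while the diagnostic payload stays behind the proxy token whenever one is configured.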
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `api/routes.py` | Updates `/health` route to return a richer payload and require auth on GET. |
| `api/health.py` | Implements the rich `/health` payload builder and helper functions. |
| `providers/rate_limit.py` | Adds limiter inspection helpers (`get_existing_scoped_instance`, `snapshot`). |
| `providers/registry.py` | Adds a diagnostic façade to fetch per-provider rate-limit snapshots without importing limiter internals from the API layer. |
| `tests/api/test_health.py` | New pytest coverage validating payload shape, redaction, auth gating, provider snapshot logic, and rate-limit reporting. |
| `README.md` | Adds AgentRouter hosted setup instructions and TOC link. |
| `PROJECT.md` | Adds a comprehensive codebase documentation file (but contains some config env var name mismatches vs `Settings`). |
| **Thinking** | `THINKING`, `THINKING_OPUS`, `THINKING_SONNET`, `THINKING_HAIKU` | Enable/disable per tier |
| **Auth** | `ANTHROPIC_AUTH_TOKEN` | Proxy auth token (optional) |
| **HTTP** | `HTTP_CONNECT_TIMEOUT`, `HTTP_READ_TIMEOUT`, `HTTP_WRITE_TIMEOUT` | Client timeouts |
| **Rate limits** | `RATE_LIMIT`, `RATE_WINDOW`, `MAX_CONCURRENCY` | Per-provider rate limiting |
| **Messaging** | `MESSAGING_PLATFORM`, `TELEGRAM_BOT_TOKEN`, `DISCORD_BOT_TOKEN` | Bot config |
| **Voice** | `WHISPER_DEVICE`, `WHISPER_MODEL` | Transcription backend |
| **Workspace** | `CLAUDE_WORKSPACE`, `CLAUDE_ALLOWED_DIRS`, `PLANS_DIRECTORY` | Agent workspace |
Comment on lines +813 to +815
- **Thinking control**: `THINKING=true`, `THINKING_SONNET=false`
- **Proxy/timeout**: `NVIDIA_NIM_PROXY`, `HTTP_READ_TIMEOUT=120`
- **Rate limits**: `RATE_LIMIT=40`, `RATE_WINDOW=60`, `MAX_CONCURRENCY=5`
> Use `sudo` on Linux/macOS if you hit permission errors.

### 2. Get An AgentRouter API Key
Comment on lines +29 to +33
@pytest.fixture
def client(app: FastAPI) -> Generator[TestClient]:
    with TestClient(app) as test_client:
        yield test_client
Comment on lines +35 to +40
@pytest.fixture(autouse=True)
def _reset_rate_limiters() -> Generator[None]:
    """Clear scoped limiters between tests so snapshots stay deterministic."""
    GlobalRateLimiter.reset_instance()
    yield
    GlobalRateLimiter.reset_instance()
| **Provider keys** | `NVIDIA_NIM_API_KEY`, `OPENROUTER_API_KEY`, `DEEPSEEK_API_KEY` | API credentials |
| **Local providers** | `LM_STUDIO_BASE_URL`, `LLAMACPP_BASE_URL`, `OLLAMA_BASE_URL` | Base URLs |
| **Model routing** | `MODEL`, `MODEL_OPUS`, `MODEL_SONNET`, `MODEL_HAIKU` | Provider-prefixed model refs |
| **Thinking** | `THINKING`, `THINKING_OPUS`, `THINKING_SONNET`, `THINKING_HAIKU` | Enable/disable per tier |
AgentRouter setup added to README
Added a new section in the README explaining how to use AgentRouter as an alternative to running the local proxy.
Why:
From experience, setting up the full local proxy (installing uv, cloning the repo, configuring .env, etc.) can feel like overkill for users who just want to get started quickly. AgentRouter makes that much easier by removing most of that setup.
What’s added:
- A simple, step-by-step guide (login → generate API key → set env vars → run)
- Examples for different environments (Windows, Linux, macOS)
- A short note on when it makes sense to use AgentRouter vs the local proxy
Improved /health endpoint
Updated the /health endpoint to return more useful information instead of just a basic "healthy" response.
Why:
Previously, /health only confirmed that the server was running, which did not help when something was misconfigured. Debugging usually meant checking logs or reading through the code.
What’s changed:
- Now returns a detailed snapshot of the current configuration (models, providers, server info, etc.)
- Shows whether providers are properly configured and if credentials are present
- Includes rate-limit state (when available)
- Gives visibility into messaging and web tools status
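To illustrate what "rate-limit state" could look like in the payload, here is a minimal sliding-window sketch with a read-only `snapshot()` method. It is not the project's `GlobalRateLimiter`: the class name `SlidingWindowLimiter`, its fields, and the returned keys are all hypothetical, and only the `RATE_LIMIT` / `RATE_WINDOW` concepts come from the PR.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `rate_limit` calls per `rate_window` seconds."""

    def __init__(
        self,
        rate_limit: int,
        rate_window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate_limit = rate_limit
        self.rate_window = rate_window
        self._clock = clock  # injectable so tests can control time
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call if the window has capacity; return whether it was allowed."""
        now = self._clock()
        cutoff = now - self.rate_window
        # Drop timestamps that have aged out of the window.
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        if len(self._stamps) >= self.rate_limit:
            return False
        self._stamps.append(now)
        return True

    def snapshot(self) -> Dict[str, float]:
        """Report current usage without mutating limiter state."""
        cutoff = self._clock() - self.rate_window
        used = sum(1 for stamp in self._stamps if stamp > cutoff)
        return {
            "rate_limit": self.rate_limit,
            "rate_window": self.rate_window,
            "used": used,
            "remaining": self.rate_limit - used,
        }
```

Keeping `snapshot()` free of side effects means a health check can never change how the limiter behaves for real traffic.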
Security:
Sensitive data such as API keys and tokens is never exposed; only safe indicators (such as booleans) are returned.
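The redaction idea can be sketched as follows. This is an illustration under assumptions, not the code in `api/health.py`: `DemoSettings`, `build_safe_payload`, the suffix-based secret detection, and the `_present` key naming are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional

# Assumption: secrets can be recognised by their field-name suffix.
SECRET_SUFFIXES = ("_api_key", "_token")


@dataclass
class DemoSettings:
    """Hypothetical stand-in for the project's settings object."""

    nvidia_nim_api_key: Optional[str] = None
    openrouter_api_key: Optional[str] = None
    model: str = "demo_provider/demo-model"  # placeholder model ref


def build_safe_payload(settings: DemoSettings) -> Dict[str, Any]:
    """Copy non-secret fields as-is; replace secrets with presence booleans."""
    payload: Dict[str, Any] = {}
    for field in fields(settings):
        value = getattr(settings, field.name)
        if field.name.endswith(SECRET_SUFFIXES):
            # Only report whether a non-blank credential exists, never its value.
            payload[field.name + "_present"] = bool(value and value.strip())
        else:
            payload[field.name] = value
    return payload
```

A test for this kind of builder can assert that the raw secret string never appears anywhere in the serialized payload, which is the property the PR's test suite describes as "redaction".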