Context for AI agents (Claude Code, Copilot, Cursor, etc.) working on this codebase. Read this fully before making changes.
Bifrost is a high-performance AI gateway that unifies 20+ LLM providers behind a single OpenAI-compatible API with ~11µs overhead at 5,000 RPS. It also serves as an MCP (Model Context Protocol) gateway, turning static chat models into tool-calling agents.
GitHub: maximhq/bifrost
bifrost/
├── core/ # Go core library — the engine
│ ├── bifrost.go # Main struct, request queuing, provider lifecycle (~3.4K lines)
│ ├── inference.go # Inference routing, fallbacks, streaming dispatch (~1.9K lines)
│ ├── mcp.go # MCP integration entry point
│ ├── schemas/ # ALL shared Go types — 41 files
│ │ ├── bifrost.go # BifrostConfig, ModelProvider enum, RequestType enum, context keys
│ │ ├── provider.go # Provider interface (30+ methods), NetworkConfig, ProviderConfig
│ │ ├── plugin.go # LLMPlugin, MCPPlugin, HTTPTransportPlugin, ObservabilityPlugin
│ │ ├── context.go # BifrostContext (custom context.Context with mutable values)
│ │ ├── chatcompletions.go # Chat completion request/response types
│ │ ├── responses.go # OpenAI Responses API types
│ │ ├── embedding.go # Embedding types
│ │ ├── images.go # Image generation types
│ │ ├── batch.go # Batch operation types
│ │ ├── files.go # File management types
│ │ ├── mcp.go # MCP types
│ │ ├── trace.go # Tracer interface
│ │ └── logger.go # Logger interface
│ ├── providers/ # 20+ provider implementations
│ │ ├── openai/ # Reference implementation (largest, most complete)
│ │ ├── anthropic/ # Non-OpenAI-compatible example
│ │ ├── bedrock/ # AWS event-stream protocol
│ │ ├── gemini/ # Google-specific API shape
│ │ ├── groq/ # OpenAI-compatible (minimal, delegates to openai/)
│ │ └── utils/ # Shared: HTTP client, SSE parsing, error handling, scanner pool
│ ├── pool/ # Generic Pool[T] — dual-mode (prod: sync.Pool, debug: full tracking)
│ │ ├── pool_prod.go # Zero-overhead sync.Pool wrapper (default build)
│ │ └── pool_debug.go # Double-release/use-after-release/leak detection (-tags pooldebug)
│ ├── mcp/ # MCP protocol implementation
│ │ ├── agent.go # Agent orchestration loop (multi-turn tool calling)
│ │ ├── clientmanager.go # MCP client lifecycle management
│ │ ├── toolmanager.go # Tool registration, discovery, filtering
│ │ ├── healthmonitor.go # Client health monitoring
│ │ └── codemode/starlark/ # Starlark sandbox for code-mode execution
│ └── internal/
│ ├── llmtests/ # LLM integration test infra (48 files, scenario-based)
│ └── mcptests/ # MCP/Agent test infra (40+ files, mock-based)
│
├── framework/ # Data persistence, streaming, ecosystem utilities
│ ├── configstore/ # Config storage backends (file, postgres)
│ ├── logstore/ # Log storage backends (file, postgres)
│ ├── vectorstore/ # Vector storage (Weaviate, Qdrant, Redis, Pinecone)
│ ├── streaming/ # Streaming accumulator, delta copying, response marshaling
│ │ ├── accumulator.go # Chunk accumulation into full response (~24KB)
│ │ ├── chat.go # Chat stream handling (~17KB)
│ │ └── responses.go # Response stream marshaling (~35KB)
│ ├── modelcatalog/ # Model metadata registry
│ ├── tracing/ # Distributed tracing helpers
│ └── encrypt/ # Encryption utilities
│
├── transports/
│ ├── config.schema.json # JSON Schema — THE source of truth for config.json (~2700 lines)
│ └── bifrost-http/ # HTTP gateway transport
│ ├── server/ # Server lifecycle, route registration
│ ├── handlers/ # 27 HTTP endpoint handlers
│ │ ├── inference.go # Chat/text completions, responses API (~109KB)
│ │ ├── mcpinference.go # MCP tool execution
│ │ ├── governance.go # Virtual keys, teams, customers, budgets (~100KB)
│ │ ├── providers.go # Provider CRUD, key management
│ │ ├── mcp.go # MCP client registry management
│ │ ├── logging.go # Log queries, stats, histograms
│ │ ├── config.go # System configuration
│ │ ├── plugins.go # Plugin CRUD
│ │ ├── cache.go # Cache management
│ │ ├── session.go # Auth/session management
│ │ ├── health.go # Health checks
│ │ ├── mcpserver.go # MCP server (SSE/streamable HTTP)
│ │ ├── websocket.go # WebSocket handler
│ │ ├── devpprof.go # Pool debug profiler endpoint (~23KB)
│ │ └── middlewares.go # Middleware definitions
│ ├── lib/ # ChainMiddlewares, config, context conversion
│ └── integrations/ # SDK compatibility layers
│ ├── openai.go # OpenAI SDK drop-in compatibility
│ ├── anthropic.go # Anthropic SDK compatibility
│ ├── bedrock.go # AWS Bedrock SDK compatibility
│ ├── genai.go # Google GenAI SDK compatibility
│ ├── langchain.go # LangChain compatibility
│ ├── litellm.go # LiteLLM compatibility
│ └── pydanticai.go # PydanticAI compatibility
│
├── plugins/ # Go plugins — each has own go.mod
│ ├── governance/ # Budget, rate limiting, virtual keys, routing, RBAC
│ ├── telemetry/ # Prometheus metrics, push gateway
│ ├── logging/ # Request/response audit logging
│ ├── semanticcache/ # Semantic response caching via vector store
│ ├── otel/ # OpenTelemetry tracing
│ ├── mocker/ # Mock responses for testing
│ ├── jsonparser/ # JSON extraction utilities
│ ├── maxim/ # Maxim observability
│ └── compat/ # LiteLLM SDK compatibility (HTTP transport)
│
├── ui/ # React + vite web interface
│ ├── app/workspace/ # Feature pages (20+ workspace sections)
│ ├── components/ # Shared React components
│ └── lib/ # Constants, utilities, types
│
├── tests/e2e/ # Playwright E2E tests
│ ├── core/ # Fixtures, page objects, helpers, API actions
│ └── features/ # Per-feature test suites
│
├── docs/ # Mintlify MDX documentation
│ ├── docs.json # Navigation config
│ ├── media/ # Screenshots (ui-*.png naming convention)
│ └── (architecture|features|providers|mcp|plugins|enterprise|...)
│
├── .claude/skills/ # Claude Code skill definitions (4 skills)
├── go.work # Go workspace — requires Go 1.26.1
├── Makefile # Build, test, dev commands (1300+ lines)
└── terraform/ # Infrastructure as Code
Bifrost is a multi-module Go workspace. Each module has its own go.mod:
go.work
├── core/go.mod # github.com/maximhq/bifrost/core
├── framework/go.mod # github.com/maximhq/bifrost/framework
├── transports/go.mod # github.com/maximhq/bifrost/transports
└── plugins/*/go.mod # 9 plugin modules (governance, telemetry, logging, etc.)
Rules:
- Run `go mod tidy` in the specific module directory, not the root
- Cross-module imports resolve via the workspace locally, but need explicit `require` entries in `go.mod` for releases
- The workspace requires Go 1.26.1 (`go.work` directive)
# Development
make dev # Full local dev (UI + API with hot reload via air)
make build # Build bifrost-http binary
# Core tests (provider integration tests — hit live APIs)
make test-core # All providers
make test-core PROVIDER=openai # Specific provider
make test-core PROVIDER=openai TESTCASE=TestSimpleChat # Specific test
make test-core PATTERN=TestStreaming # Tests matching pattern
make test-core DEBUG=1 # With Delve debugger on :2345
# MCP/Agent tests (mock-based, no live APIs)
make test-mcp # All MCP tests
make test-mcp TESTCASE=TestAgentLoop # Specific test
make test-mcp TYPE=agent # By category (agent|tool|connection|codemode)
# Plugin tests
make test-plugins # All plugins
make test-governance # Governance plugin specifically
# Integration tests (SDK compatibility)
make test-integrations-py # Python SDK tests
make test-integrations-ts # TypeScript SDK tests
# E2E tests (Playwright, requires running dev server)
make run-e2e # All E2E tests
make run-e2e FLOW=providers # Specific feature
# Code quality
make lint # Linting
make fmt              # Format code

Client HTTP Request
→ FastHTTP Transport (parsing, validation ~2µs)
→ SDK Integration Layer (OpenAI/Anthropic/Bedrock format → Bifrost format)
→ Middleware Chain (lib.ChainMiddlewares, applied per-route)
→ HTTPTransportPreHook (HTTP-level plugins, can short-circuit)
→ PreLLMHook Pipeline (auth, rate-limit, cache check — registration order)
→ MCP Tool Discovery & Injection (if tool_choice present)
→ Provider Queue (channel-based, per-provider isolation)
→ Worker picks up request
→ Key Selection (~10ns weighted random)
→ Provider API Call (fasthttp client, connection pooling)
→ Response / SSE Stream
→ PostLLMHook Pipeline (reverse order of PreLLMHooks)
→ Tool Execution Loop (if tool_calls in response, MCP agent loop)
→ HTTPTransportPostHook (reverse order)
→ Response Serialization
→ HTTP Response to Client
- Provider isolation: Each provider has its own worker pool and queue. One provider going down doesn't cascade to others.
- Channel-based async: Request routing uses Go channels (`chan *ChannelMessage`), not mutexes. The `ProviderQueue` struct manages channel lifecycle with atomic flags.
- Object pooling everywhere: `sync.Pool` wrappers reduce GC pressure. Pools exist for: channel messages, response channels, error channels, stream channels, plugin pipelines, MCP requests, HTTP request/response objects, scanner buffers.
- Plugin pipeline symmetry: Pre-hooks execute in registration order, post-hooks in reverse order (LIFO). For every pre-hook executed, the corresponding post-hook is guaranteed to run.
- Streaming: SSE chunks flow through `chan chan *schemas.BifrostStreamChunk`. Chunks are accumulated into a full response for post-hooks via `framework/streaming/accumulator.go`.
BifrostContext (core/schemas/context.go) is a custom context.Context with thread-safe mutable values. Unlike standard Go contexts, values can be set after creation:
```go
ctx := schemas.NewBifrostContext(parent, deadline)
ctx.SetValue(key, value)  // Thread-safe, uses RWMutex
ctx.WithValue(key, value) // Chainable variant
```

Reserved context keys (set by Bifrost internals — DO NOT set manually):
- `BifrostContextKeySelectedKeyID`/`Name` — Set by governance plugin
- `BifrostContextKeyGovernance*` — Set by governance plugin
- `BifrostContextKeyNumberOfRetries`, `BifrostContextKeyFallbackIndex` — Set by retry/fallback logic
- `BifrostContextKeyStreamEndIndicator` — Set by streaming infrastructure
- `BifrostContextKeyTrace*`, `BifrostContextKeySpan*` — Set by tracing middleware
User-settable keys (plugins and handlers can set these):
- `BifrostContextKeyVirtualKey` (`x-bf-vk`) — Virtual key for governance
- `BifrostContextKeyAPIKeyName` (`x-bf-api-key`) — Explicit key selection by name
- `BifrostContextKeyAPIKeyID` (`x-bf-api-key-id`) — Explicit key selection by ID (takes priority over name)
- `BifrostContextKeyRequestID` — Request ID
- `BifrostContextKeyExtraHeaders` — Extra headers to forward to the provider
- `BifrostContextKeyURLPath` — Custom URL path for the provider
- `BifrostContextKeySkipKeySelection` — Skip key selection (pass empty key)
- `BifrostContextKeyUseRawRequestBody` — Send raw body directly to the provider
Gotcha: `BlockRestrictedWrites()` silently drops writes to reserved keys. This prevents plugins from accidentally overwriting internal state.
There are two categories of providers:
Category 1: Non-OpenAI-compatible (Anthropic, Bedrock, Gemini, Cohere, HuggingFace, Replicate, ElevenLabs):
core/providers/<name>/
├── <name>.go # Controller: constructor, interface methods, HTTP orchestration
├── <name>_test.go # Tests
├── types.go # ALL provider-specific structs (PascalCase prefixed with provider name)
├── utils.go # Constants, base URLs, helpers (camelCase for unexported)
├── errors.go # Error parsing: provider HTTP error → *schemas.BifrostError
├── chat.go # Chat request/response converters
├── embedding.go # Embedding converters (if supported)
├── images.go # Image generation (if supported)
├── speech.go # TTS/STT (if supported)
└── responses.go # Responses API + streaming converters
Category 2: OpenAI-compatible (Groq, Cerebras, Ollama, Perplexity, OpenRouter, Parasail, Nebius, xAI, SGL):
core/providers/<name>/
├── <name>.go # Minimal — constructor + delegates to openai.HandleOpenAI* functions
└── <name>_test.go # Tests
Converter function naming convention:
- `To<ProviderName><Feature>Request()` — Bifrost schema → Provider API format
- `ToBifrost<Feature>Response()` — Provider API format → Bifrost schema
- These must be pure transformation functions — no HTTP calls, no logging, no side effects
Provider constructor pattern:
```go
func NewProvider(config schemas.ProviderConfig) (*Provider, error) {
    // Validate config, set up fasthttp.Client with connection pooling
    client := &fasthttp.Client{
        MaxConnsPerHost:     config.NetworkConfig.MaxConnsPerHost, // configurable, default 5000
        MaxIdleConnDuration: 30 * time.Second,
    }
    // After ConfigureProxy/ConfigureDialer/ConfigureTLS, build a sibling client
    // for streaming. BuildStreamingClient zeros ReadTimeout/WriteTimeout/MaxConnDuration
    // so streams aren't killed by fasthttp's whole-response deadline; per-chunk idle
    // is enforced at the app layer via NewIdleTimeoutReader.
    streamingClient := providerUtils.BuildStreamingClient(client)
    return &Provider{client: client, streamingClient: streamingClient /* ... */}, nil
}
```

Streaming vs unary client: Every provider holds two clients — `client` for unary requests (`ReadTimeout=30s` bounds the whole response) and `streamingClient` for SSE / EventStream / chunked paths (`ReadTimeout=0`; the per-chunk `NewIdleTimeoutReader` is the only governor). Pass `provider.streamingClient` to every `Handle*Streaming` / `Handle*StreamRequest` helper and to direct `Do` calls inside `*Stream` methods. For new providers, apply the same pattern — missing the switch means streams get killed at 30s.
Note: Bedrock uses `net/http` (not fasthttp) with HTTP/2 support. Its `http.Transport` is configured with `ForceAttemptHTTP2: true` and `MaxConnsPerHost` from NetworkConfig to allow multiple HTTP/2 connections when the server's per-connection stream limit (100 for AWS Bedrock) is reached. Use `providerUtils.BuildStreamingHTTPClient(client)` to derive the streaming variant — it shares the base `Transport` (safe for concurrent reuse) but clears `Client.Timeout`.
core/schemas/provider.go defines the Provider interface with 30+ methods. Every provider must implement all of them (returning "not supported" for unsupported operations). The interface covers:
- `ListModels`
- `ChatCompletion`, `ChatCompletionStream`
- `Responses`, `ResponsesStream` (OpenAI Responses API)
- `TextCompletion`, `TextCompletionStream`
- `Embedding`, `Speech`, `SpeechStream`, `Transcription`, `TranscriptionStream`
- `ImageGeneration`, `ImageGenerationStream`, `ImageEdit`, `ImageEditStream`, `ImageVariation`
- `CountTokens`
- `Batch*` (Create, List, Retrieve, Cancel, Results)
- `File*` (Upload, List, Retrieve, Delete, Content)
- `Container*` and `ContainerFile*` (Create, List, Retrieve, Delete, Content)
Streaming methods receive a PostHookRunner callback and return chan *BifrostStreamChunk:
```go
ChatCompletionStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key, request *BifrostChatRequest) (chan *BifrostStreamChunk, *BifrostError)
```

Each provider has `errors.go` with an `ErrorConverter` function:

```go
type ErrorConverter func(resp *fasthttp.Response, requestType schemas.RequestType, providerName schemas.ModelProvider, model string) *schemas.BifrostError
```

The shared utility `providerUtils.HandleProviderAPIError()` handles common HTTP error parsing. Provider-specific parsers add extra field mapping. Errors always carry metadata:

```go
bifrostErr.ExtraFields.Provider = providerName
bifrostErr.ExtraFields.ModelRequested = model
bifrostErr.ExtraFields.RequestType = requestType
```

Four plugin interfaces exist:
| Interface | Hook Methods | When Called |
|---|---|---|
| `LLMPlugin` | `PreLLMHook`, `PostLLMHook` | Every LLM request (SDK + HTTP) |
| `MCPPlugin` | `PreMCPHook`, `PostMCPHook` | Every MCP tool execution |
| `HTTPTransportPlugin` | `HTTPTransportPreHook`, `HTTPTransportPostHook`, `HTTPTransportStreamChunkHook` | HTTP gateway only (not Go SDK) |
| `ObservabilityPlugin` | `Inject(ctx, trace)` | Async, after response written to wire |
Key plugin behaviors:
- Plugin errors are logged as warnings, never returned to the caller
- Pre-hooks can short-circuit by returning `*LLMPluginShortCircuit` (cache hit, auth failure, rate limit)
- Post-hooks receive both response and error — either can be nil. Plugins can recover from errors (set error to nil, provide response) or invalidate responses (set response to nil, provide error)
- `BifrostError.AllowFallbacks` controls whether fallback providers are tried: `nil` or `&true` = allow, `&false` = block
- `HTTPTransportStreamChunkHook` is called per-chunk during streaming — it can modify, skip, or abort the stream
core/pool/ provides Pool[T] with two build modes:
```go
// Production (default): zero-overhead sync.Pool wrapper
// Debug (-tags pooldebug): tracks double-release, use-after-release, leaks with stack traces
p := pool.New[MyType]("descriptive-name", func() *MyType { return &MyType{} })
obj := p.Get()
// ... use obj ...
// MUST reset ALL fields before Put — pool does not auto-reset
p.Put(obj)
```

Acquire/Release pattern for types with complex reset logic (used in schemas/plugin.go):

```go
req := schemas.AcquireHTTPRequest()   // Get from pool, pre-allocated maps
defer schemas.ReleaseHTTPRequest(req) // Clears all maps and fields, returns to pool
```

Handler pattern: Handlers are structs with injected dependencies:
```go
type CompletionHandler struct {
    client       *bifrost.Bifrost
    handlerStore lib.HandlerStore
    config       *lib.Config
}
```

Route registration: Each handler implements `RegisterRoutes(router, middlewares...)` — routes get middleware chains applied per-route via `lib.ChainMiddlewares()`.
SDK integration layers (transports/bifrost-http/integrations/) provide request/response converters between provider-native SDK formats and Bifrost's internal format. This enables drop-in replacement of OpenAI SDK, Anthropic SDK, AWS Bedrock SDK, Google GenAI SDK, LangChain, and LiteLLM.
Every pooled object must have all fields zeroed before pool.Put(). Stale data leaks between requests. The debug build catches double-release and use-after-release but not missing resets.
```go
// WRONG — stale data from previous request leaks to the next user
pool.Put(msg)

// RIGHT
msg.Response = nil
msg.Error = nil
msg.Context = nil
msg.ResponseStream = nil
pool.Put(msg)
```

ProviderQueue uses atomic flags and sync.Once to prevent "send on closed channel" panics:
```go
type ProviderQueue struct {
    queue      chan *ChannelMessage
    done       chan struct{}
    closing    uint32    // atomic: 0=open, 1=closing
    signalOnce sync.Once // ensure signal fires only once
    closeOnce  sync.Once // ensure close fires only once
}
```

Always check the atomic closing flag before sending. Never close a channel without this pattern.
`RetryBackoffInitial` and `RetryBackoffMax` are `time.Duration` (nanoseconds) in Go but milliseconds (integers) in JSON. Custom `MarshalJSON`/`UnmarshalJSON` handles the conversion. If adding new duration fields to any config struct, follow this pattern exactly.
`NetworkConfig.ExtraHeaders` is deep-copied in `CheckAndSetDefaults()` to prevent data races between concurrent requests. Apply the same `maps.Copy()` pattern to any new map fields in config structs.
Adding a new operation type requires changes across the entire codebase:
- Add method to `Provider` interface in `core/schemas/provider.go`
- Implement in all 20+ providers (most return "not supported")
- Add `RequestType` constant in `core/schemas/bifrost.go`
- Add to `AllowedRequests` struct and `IsOperationAllowed()` switch
- Add handler endpoint in `transports/bifrost-http/handlers/`
- Wire up in `core/bifrost.go` and `core/inference.go`
Groq, Cerebras, Ollama, Perplexity, OpenRouter, Parasail, Nebius, xAI, and SGL all delegate to `openai.HandleOpenAI*` functions. Any change to OpenAI converter logic affects all of them. Always test broadly: `make test-core` (all providers).
The SSE scanner buffer pool in core/providers/utils/utils.go starts at 4KB. Buffers grow dynamically but those exceeding 64KB are discarded (not returned to pool) to prevent memory bloat. Be aware when working with providers that send very large SSE events.
Pre-hooks: registration order (first registered → first to run). Post-hooks: reverse order. This creates "wrapping" semantics — the first plugin registered is the outermost wrapper (its pre-hook runs first, post-hook runs last). Changing registration order changes behavior.
When a provider fails and the request falls to a fallback, the entire plugin pipeline re-executes from scratch. Governance checks, caching, and logging all run again for each attempt. Intentional, but surprising when debugging request counts or cost tracking.
A nil *AllowedRequests means "all operations allowed." A non-nil value only allows fields explicitly set to true. This applies to both ProviderConfig.AllowedRequests and CustomProviderConfig.AllowedRequests.
When `BlockRestrictedWrites()` is active, writes to reserved keys (governance IDs, retry counts, fallback index, etc.) are silently ignored — no error. If your plugin needs to pass data through context, use your own custom key type.
Bifrost uses github.com/valyala/fasthttp for provider HTTP calls. The API is different from net/http:
- Use `fasthttp.AcquireRequest()` / `fasthttp.ReleaseRequest()` for lifecycle
- `fasthttp.Client` pools connections per-host (`NetworkConfig.MaxConnsPerHost`, default 5000, 30s idle)
- Request/response bodies are accessed via `resp.Body()` (returns `[]byte`, not `io.Reader`)
- Exception: Bedrock uses `net/http` (for AWS SigV4 signing) with `http.Transport` configured for HTTP/2 multi-connection support
JSON marshaling in hot paths uses github.com/bytedance/sonic for performance. core/schemas/ uses standard encoding/json for custom marshaling (e.g., NetworkConfig). Don't mix them accidentally.
Bifrost uses atomic.Pointer for providers and plugins lists. On updates: create new slice → atomically swap pointer. Never mutate the slice in place — concurrent readers would see partial state.
Tool access follows: Global filter → Client-level filter → Tool-level filter → Per-request filter (HTTP headers). All four levels must agree for a tool to be available. Changes to filtering logic must respect this hierarchy.
transports/config.schema.json (~2700 lines) is the authoritative definition for all config.json fields. Documentation examples must match. When adding config fields: update schema first → handlers → docs.
E2E tests depend on data-testid attributes. Convention: data-testid="<entity>-<element>-<qualifier>". If you rename or remove one, search tests/e2e/ for references. If you add new interactive elements, add data-testid.
In tests/e2e/core/, never marshal API payloads to a Record/Map/plain-object and then re-serialize. Field ordering matters for backend validation and snapshot comparisons. Construct payloads as object literals with fields in the intended order and pass directly to Playwright's request.post({ data }). Avoid Object.fromEntries(), JSON.parse(JSON.stringify(...)) round-trips, or destructuring into an intermediate Record<string, unknown> — these can silently reorder fields.
- Create `core/providers/<name>/` with files per the pattern (see "Provider Implementation" above)
- Add `ModelProvider` constant in `core/schemas/bifrost.go`
- Add to `StandardProviders` list in `core/schemas/bifrost.go`
- Register in `core/bifrost.go` — add import + case in provider init switch
- UI integration (all required):
  - `ui/lib/constants/config.ts` — model placeholder + key requirement
  - `ui/lib/constants/icons.tsx` — provider icon
  - `ui/lib/constants/logs.ts` — provider display name (2 places)
  - `docs/openapi/openapi.json` — OpenAPI spec update
  - `transports/config.schema.json` — config schema (2 locations)
- CI/CD: Add env vars to `.github/workflows/pr-tests.yml` and `release-pipeline.yml` (4 jobs)
- Docs: Create `docs/providers/supported-providers/<name>.mdx`
- Test: `make test-core PROVIDER=<name>`
The make test-core target is the canonical harness for provider tests — it wires up env vars from .env (provider API keys), invokes the per-provider {provider}_test.go entrypoint in core/providers/<provider>/, and routes through the shared core/internal/llmtests/ scenario suite that validates end-to-end behavior (including streaming).
Running bare go test ./core/providers/<provider>/... only executes unit tests and skips the llmtests scenarios — so it won't catch regressions in streaming, tool-calling, or provider-specific response shapes.
make test-core PROVIDER=anthropic TESTCASE=TestChatCompletionStream  # exact test
make test-core PROVIDER=openai PATTERN=Stream                        # substring match
make test-core PROVIDER=bedrock                                      # all scenarios for one provider
make test-core DEBUG=1 PROVIDER=gemini TESTCASE=TestResponsesStream  # attach Delve on :2345

PATTERN and TESTCASE are mutually exclusive. Provider name must match a directory under core/providers/ (e.g. anthropic, openai, bedrock, vertex, azure, gemini, cohere, mistral, groq, etc.).
Scenario-based tests that run against live provider APIs with dual-API testing (Chat Completions + Responses API):
```go
func RunMyScenarioTest(t *testing.T, client *bifrost.Bifrost, ctx context.Context, cfg ComprehensiveTestConfig) {
    // Use validation presets: BasicChatExpectations(), ToolCallExpectations(), etc.
    // Use retry framework for flaky assertions
}
```

- Register in the `testScenarios` slice in `tests.go`
- Add a `Scenarios.MyScenario` flag to `ComprehensiveTestConfig`
- Run: `make test-core PROVIDER=<name> TESTCASE=<TestName>`
Mock-based tests with DynamicLLMMocker and declarative setup:
```go
manager, mocker, ctx := SetupAgentTest(t, AgentTestConfig{
    InProcessTools:   []string{"echo", "calculator"},
    AutoExecuteTools: []string{"*"},
    MaxDepth:         5,
})
// Queue mock LLM responses, assert tool execution order
```

Categories: `agent_*_test.go`, `tool_*_test.go`, `connection_*_test.go`, `codemode_*_test.go`
Run: make test-mcp TESTCASE=<TestName>
Playwright tests with page objects, data factories, fixtures:
- Page objects extend `BasePage`, use `getByTestId()` as the primary selector strategy
- Data factories use `Date.now()` for unique names (prevents collisions in parallel runs)
- Track created resources in arrays, clean up in `afterEach`
- Import `test`/`expect` from `../../core/fixtures/base.fixture` (never from `@playwright/test`)
- Never marshal API payloads to a `Record`/`Map`/plain object and then re-serialize. Field ordering matters for snapshot comparisons and some backend validations. Construct payloads as object literals with fields in the intended order and pass directly to Playwright's `request.post({ data })`. Do NOT destructure into an intermediate `Record<string, unknown>` or use `Object.fromEntries()` / `JSON.parse(JSON.stringify(...))` round-trips, as these can reorder fields.
Run: make run-e2e FLOW=<feature>
Four skills are available via /skill-name:
Write, update, or review Mintlify MDX documentation. Researches UI code, Go handlers, and config schema. Validates config.json examples against transports/config.schema.json. Outputs docs with Web UI / API / config.json tabs.
Variants: `/docs-writer update <doc-path>`, `/docs-writer review <doc-path>`
Create, run, debug, audit, or auto-update Playwright E2E tests.
Variants:
- `/e2e-test fix <spec>` — Debug and fix a failing test
- `/e2e-test sync` — Detect UI changes, update affected tests automatically
- `/e2e-test audit` — Scan specs for incorrect/weak assertions (P0-P6 severity scale)
Investigate a GitHub issue from maximhq/bifrost. Fetches issue details, classifies by type/area, searches codebase, traces dependencies, analyzes side effects, suggests tests (LLM/MCP/E2E), and presents an implementation plan with per-change approval gates.
Systematically address unresolved PR review comments. Uses GraphQL to get unresolved threads, presents each with FIX/REPLY/SKIP options, collects fixes locally, and only posts replies after code is pushed to remote.
- Change types in `core/schemas/chatcompletions.go`
- Update converter functions in each provider's `chat.go`
- If streaming is affected, update `framework/streaming/` (accumulator, delta copy)
- Run `make test-core` (all providers)
- Add to the schema type in `core/schemas/`
- Map in the provider response converter (`ToBifrost*Response`)
- Handle in the streaming accumulator if applicable
- Update the HTTP handler if the field needs special serialization
- Update `transports/config.schema.json` if configurable
- Create `plugins/<name>/` with its own `go.mod`
- Implement the `LLMPlugin`, `MCPPlugin`, or `HTTPTransportPlugin` interface
- Add to `go.work`
- Register in the transport layer or Bifrost config
- Add test targets to the `Makefile`
- Find the workspace page: `ui/app/workspace/<feature>/`
- Check existing `data-testid` attributes — E2E tests depend on them
- Add `data-testid` to new interactive elements
- Run `make run-e2e FLOW=<feature>` to verify
- If E2E tests break, use `/e2e-test sync` to update them
| What | Where |
|---|---|
| Main Bifrost struct & queuing | core/bifrost.go |
| Inference routing & fallbacks | core/inference.go |
| Provider interface (30+ methods) | core/schemas/provider.go |
| ModelProvider enum & context keys | core/schemas/bifrost.go |
| Plugin interfaces & pooled HTTP types | core/schemas/plugin.go |
| BifrostContext (mutable context) | core/schemas/context.go |
| Chat completion types | core/schemas/chatcompletions.go |
| Responses API types | core/schemas/responses.go |
| Object pool (prod + debug) | core/pool/pool_prod.go, pool_debug.go |
| Shared provider utils & SSE parsing | core/providers/utils/utils.go |
| Streaming accumulator | framework/streaming/accumulator.go |
| HTTP inference handler | transports/bifrost-http/handlers/inference.go |
| Governance handler | transports/bifrost-http/handlers/governance.go |
| Config schema (source of truth) | transports/config.schema.json |
| Pool debug profiler | transports/bifrost-http/handlers/devpprof.go |
| LLM test infrastructure | core/internal/llmtests/ |
| MCP test infrastructure | core/internal/mcptests/ |
| E2E test infrastructure | tests/e2e/core/ |
| Docs navigation config | docs/docs.json |
| CI/CD workflows | .github/workflows/ |
- Go: `gofmt`/`goimports`. No custom linter config.
- TypeScript/React: Oxfmt. TanStack Router.
- JSON tags: `snake_case`, matching provider API conventions.
- Error strings: Lowercase, no trailing punctuation (Go convention).
- Provider types: Prefixed with the provider name in PascalCase (`AnthropicChatRequest`, `GeminiEmbeddingResponse`).
- Converter functions: Pure — no side effects, no logging, no HTTP.
- Pool names: Descriptive string passed to `pool.New()` (e.g., `"channel-message"`, `"response-stream"`).
- Context keys: Use the `BifrostContextKey` type. Custom plugins should define their own key types to avoid collisions.
- Go filenames: No underscores. The only permitted underscore is the `_test.go` suffix. Examples: `pluginpipeline.go`, `pluginpipeline_test.go` — never `plugin_pipeline.go` or `plugin_pipeline_race_test.go`. Concatenate words (lowercase, no separators) for multi-word filenames.
This document defines the standards, structure, and best practices for writing frontend code in this project.
- React (with Vite)
- TypeScript
- @tanstack/react-router (type-safe routing)
- Tailwind CSS v4
- Radix UI (primitives)
- Local UI component library (`ui/components/ui/`) built on Radix primitives
/ui
├── app # Routes & pages
├── components # Shared components
│ └── ui # Core design system components
├── hooks # Custom React hooks
├── lib # Utilities, helpers, shared logic
└── app/enterprise # Enterprise-specific code (via symlink)
- All frontend code must live inside `/ui`
- Routes and pages → `ui/app`
- Shared/reusable components → `ui/components`
- Core UI primitives → `ui/components/ui`
- Utilities and libraries → `ui/lib`
- Custom hooks → `ui/hooks`
- `react` → UI library
- `typescript` → Type safety
- `tailwindcss` → Styling
- `@tanstack/react-router` → Routing
- `@radix-ui/react-*` → UI primitives
- `ui/components/ui/*` → Project's Radix-based component system
- `recharts` → Charts
- `monaco-editor` → Code editor
- `date-fns` → Date/time formatting
- `nuqs` → Query param state management
- Oxfmt → Code formatting
- `vitest` → Testing
For every new route:
ui/app/<route-name>/
├── layout.tsx # Route definition using createFileRoute
├── page.tsx # Page content
└── views/ # Optional: route-specific components
- Folder name must match route name
- Always use `createFileRoute` in `layout.tsx`
- `page.tsx` should only handle composition (not heavy logic)
- Route-specific components go inside `views/`
- Always check if similar components/functions already exist
- Prefer extending or refactoring existing code over duplication
- Only create new components if reuse is not feasible
- Shared → `ui/components`
- Route-specific → `views/` inside the route folder
- Avoid deeply nested conditional rendering
- Break complex UI into smaller components
- Keep components readable and maintainable
- Always use stable, unique keys
- Never use array index as key (unless unavoidable)
- Avoid unnecessary or unstable dependencies in hooks
- Prevent infinite loops in `useEffect`
- Keep dependency arrays accurate and minimal
- Prefer derived state over duplicated state
- Query params (`nuqs`) → for persistent/shareable state
- Local state → for UI-only state
- Redux → only when truly necessary
- Use for state that should persist across refresh/navigation
- Use proper parsers like `parseAsString` or `parseAsInteger`
- Do NOT mix query param state with local/Redux state
- Follow a single consistent pattern across the codebase
- Use only when global/shared state is required
- Avoid unnecessary slices
- Prefer simpler alternatives when possible
- Use for API calls and caching
- Use granular tags for cache invalidation
- Avoid invalidating entire datasets unnecessarily
- Implement optimistic updates where applicable
We use:
- `react-hook-form`
- `zod` v4 (for schema validation)
- Always define a Zod schema
- Include meaningful validation messages
- Prefer inline field errors (not toast notifications)
- Use `refine`/`superRefine` for complex validation
- Store schemas in `ui/lib/types/schemas.ts`
- Use `@tanstack/react-table` only for large/complex datasets
- For simple tables → build custom lightweight components
- Prioritize performance over abstraction
- Lazy load heavy or rarely-used libraries
- Avoid unnecessary re-renders
- Split large components into smaller ones
- Keep bundle size minimal
- Do NOT add new dependencies unless absolutely necessary
- Always pin exact versions (no `^` or `~`)
- Prefer existing libraries in the codebase
- Avoid using `any` unless absolutely unavoidable
- Prefer strict typing and inference
- Define reusable types in shared locations
After writing code:
cd ui && npm run format

Then verify the build:

cd ui && npm run build

- Code must pass formatting and build checks
- Follow consistent naming and structure conventions
- Duplicating components without considering reuse
- Mixing multiple state management approaches unnecessarily
- Overusing Redux
- Using unstable hook dependencies
- Adding heavy libraries for simple use cases
- Poorly structured or deeply nested JSX
- Prioritize reusability, performance, and consistency
- Follow strict folder structure and routing conventions
- Use the right tool for the right problem
- Keep code simple, predictable, and maintainable