ToolHive supports three distinct deployment modes, each optimized for different use cases and environments. This document provides a detailed explanation of how ToolHive operates in each mode.
graph LR
subgraph LocalDeployment[Local Deployment]
CLI[CLI Mode<br/>thv binary]
UI[UI Mode<br/>ToolHive Studio]
end
subgraph KubernetesDeployment[Kubernetes Deployment]
Operator[Operator Mode<br/>thv-operator]
end
CLI --> Docker[Docker/Podman<br/>Colima<br/>Rancher Desktop]
UI --> Docker
Operator --> K8s[Kubernetes]
Docker --> MCP1[MCP Servers]
K8s --> MCP2[MCP Servers]
style LocalDeployment fill:#e1f5fe
style KubernetesDeployment fill:#fff3e0
| Feature | Local CLI | Local UI | Kubernetes |
|---|---|---|---|
| Binary | thv |
thv (API server) |
thv-operator + thv-proxyrunner |
| Container Runtime | Docker/Podman/Colima/Rancher | Docker/Podman/Colima/Rancher | Kubernetes |
| Process Management | Detached processes | API-managed | Operator-managed |
| State Storage | Local filesystem | Local filesystem | etcd (K8s API) |
| Scaling | Single machine | Single machine | Cluster-wide |
| Best For | Developers, CLI users | UI users, beginners | Production, multi-tenant |
graph TB
User[User] -->|CLI Commands| THV[thv binary]
THV -->|spawn detached| Proxy[Proxy Process]
Proxy -->|Docker API| Runtime[Container Runtime<br/>Docker/Podman/Colima]
Runtime -->|creates| Container[MCP Server Container]
Proxy -->|stdin/stdout or HTTP| Container
Client[MCP Client] -->|HTTP/SSE/Streamable| Proxy
style THV fill:#90caf9
style Proxy fill:#81c784
style Container fill:#ffb74d
-
User executes command:
thv run server-name -
ToolHive CLI (
cmd/thv/main.go):- Parses command-line arguments
- Loads or creates RunConfig
- Instantiates workloads API (
pkg/workloads/manager.go)
-
Workload Manager:
- Detects available container runtime (Podman → Colima → Docker)
- Creates container via Runtime API
- Spawns detached proxy process
-
Proxy Process:
- Runs as independent process (via
thv restart --foreground) - Attaches to container (for stdio) or forwards HTTP traffic
- Applies middleware chain
- Exposes local HTTP endpoint for MCP clients
- Runs as independent process (via
-
State Management:
- RunConfig saved to
~/.toolhive/state/(or XDG equivalent) - PID file for process management
- Status file for workload state tracking
- RunConfig saved to
Implementation: pkg/container/factory.go
The CLI automatically detects container runtimes in this order:
-
Podman - Checks for Podman socket at:
$TOOLHIVE_PODMAN_SOCKET(if set)/var/run/podman/podman.sock$XDG_RUNTIME_DIR/podman/podman.sock~/.local/share/containers/podman/machine/podman.sock(Podman Machine on macOS)$TMPDIR/podman/*-api.sock(Podman Machine API on macOS)
-
Colima - Checks for Colima socket at:
$TOOLHIVE_COLIMA_SOCKET(if set)~/.colima/default/docker.sock
-
Docker (including Docker Desktop, Rancher Desktop, and OrbStack) - Checks for Docker socket at:
$TOOLHIVE_DOCKER_SOCKET(if set)/var/run/docker.sock~/.docker/run/docker.sock(Docker Desktop on macOS)~/.rd/docker.sock(Rancher Desktop on macOS)~/.orbstack/run/docker.sock(OrbStack on macOS)
When running in detached mode (thv run without --foreground):
sequenceDiagram
participant User
participant THV as thv (parent)
participant THV2 as thv restart<br/>(detached child)
participant Container
User->>THV: thv run server-name
THV->>THV: Save RunConfig to state
THV->>THV2: Fork: thv restart --foreground
Note over THV2: Detached process<br/>with new session
THV->>User: Return (PID written)
THV2->>Container: Attach or proxy
Container->>THV2: MCP traffic
THV2->>THV2: Apply middleware
Note over THV2: Runs indefinitely
Key Implementation:
pkg/workloads/manager.go-RunWorkloadDetachedmethod- Uses
exec.CommandwithSysProcAttrto detach - Sets
TOOLHIVE_DETACHED=trueenvironment variable - Redirects stdout/stderr to log file:
~/.toolhive/logs/<workload>.log
| Purpose | Path (Linux) | Path (macOS) |
|---|---|---|
| State files (RunConfig) | ~/.local/state/toolhive/ |
~/Library/Application Support/toolhive/ |
| Data files (logs, PIDs, secrets, statuses) | ~/.local/share/toolhive/ |
~/Library/Application Support/toolhive/ |
| Config files | ~/.config/toolhive/ |
~/Library/Application Support/toolhive/ |
| Cache files | ~/.cache/toolhive/ |
~/Library/Caches/toolhive/ |
Implementation: Uses adrg/xdg package for XDG Base Directory compliance.
graph TB
User[User] -->|Web Browser| Studio[ToolHive Studio<br/>Web UI]
Studio -->|REST API| APIServer[thv serve<br/>API Server]
APIServer -->|Internal| Workloads[Workloads Manager]
Workloads -->|Runtime API| Runtime[Container Runtime<br/>Docker/Podman/Rancher]
Runtime -->|creates| Container[MCP Server Container]
Container -->|managed by| Proxy[Proxy Process]
Client[MCP Client] -->|HTTP| Proxy
style Studio fill:#ba68c8
style APIServer fill:#90caf9
style Proxy fill:#81c784
style Container fill:#ffb74d
-
User starts UI: ToolHive Studio application launches
-
Studio spawns API server:
thv serve- Starts HTTP API server on configurable port (default: 8080)
- Exposes RESTful endpoints for workload management
-
API Server (
pkg/api/server.go):- Handles HTTP requests from UI
- Delegates to Workloads Manager
- Returns JSON responses
-
Workload Operations:
- Create:
POST /api/v1beta/workloads - List:
GET /api/v1beta/workloads - Stop:
POST /api/v1beta/workloads/{name}/stop - Delete:
DELETE /api/v1beta/workloads/{name} - Logs:
GET /api/v1beta/workloads/{name}/logs
- Create:
-
Runtime Selection:
- Picks runtime driver based on environment
- Docker, Podman, or Rancher Desktop
- Uses driver API to spawn containers
Full API documentation available at:
- OpenAPI spec:
pkg/api/openapi.go - Interactive docs:
http://localhost:8080/api/doc(Scalar UI)
Key endpoints:
/api/v1beta/workloads- Workload management/api/v1beta/registry- Registry browsing/api/v1beta/clients- Client configuration/api/v1beta/groups- Group management
| Aspect | CLI Mode | UI Mode |
|---|---|---|
| Process Model | Detached child process | Managed by API server |
| State Access | Direct filesystem | Via API server |
| Authentication | None (local user) | Optional (configurable) |
| Middleware Config | CLI flags or config file | API requests |
| Runtime Selection | Automatic detection | User selectable in UI |
graph TB
User[User] -->|kubectl apply| K8s[Kubernetes API]
K8s -->|watch| Operator[thv-operator]
Operator -->|create| Deploy[Deployment<br/>thv-proxyrunner]
Operator -->|create| SVC[Service]
Deploy -->|create| STS[StatefulSet<br/>MCP Server]
Deploy -->|proxy to| STS
Client[MCP Client] -->|HTTP| SVC
SVC -->|route to| Deploy
style Operator fill:#5c6bc0
style Deploy fill:#90caf9
style STS fill:#ffb74d
-
User applies CRD:
kubectl apply -f mcpserver.yaml -
Operator watches resources (
cmd/thv-operator/controllers/mcpserver_controller.go):- Watches for
MCPServercustom resources - Reconciles desired state vs actual state
- Watches for
-
Operator creates Deployment:
- Runs
thv-proxyrunnercontainer - Mounts RunConfig as ConfigMap or secret
- Applies middleware configuration
- Runs
-
Proxy runner creates StatefulSet:
- Uses Kubernetes API (in-cluster client)
- Creates StatefulSet with MCP server container
- Manages container lifecycle
-
Proxy runner proxies traffic:
- Receives requests on exposed port
- Applies middleware chain
- Forwards to StatefulSet pod(s)
-
Operator creates Service:
- Exposes proxy runner Deployment
- LoadBalancer, ClusterIP, or NodePort
- Routes external traffic to proxy
thv-operator (cmd/thv-operator/):
- Watches Kubernetes API for CRDs
- Reconciles desired vs actual state
- Creates Kubernetes resources (Deployments, Services, ConfigMaps)
- Does NOT run the proxy or create containers directly
thv-proxyrunner (cmd/thv-proxyrunner/):
- Runs as a container in the Deployment
- Creates containers via Kubernetes API (StatefulSets)
- Applies middleware and proxies MCP traffic
- Handles transport-specific communication
Why not use thv in Kubernetes?
thvis optimized for local Docker/Podman API usage- Kubernetes requires different container creation logic (StatefulSets vs standalone containers)
- Separation of concerns: operator manages K8s resources, proxy-runner manages MCP traffic
graph LR
subgraph "Namespace: default"
Deploy[Deployment<br/>proxy-runner<br/>Replicas: 1]
SVC[Service<br/>proxy-svc]
STS[StatefulSet<br/>mcp-server<br/>Replicas: 1]
end
Deploy -->|manages| STS
SVC -->|routes to| Deploy
Deploy -.->|watches| STS
style Deploy fill:#90caf9
style STS fill:#ffb74d
style SVC fill:#81c784
ToolHive provides several CRDs for managing MCP servers in Kubernetes:
- MCPServer - Defines an MCP server deployment with container images, transports, and middleware
- MCPRegistry - Manages MCP server registries from Git or ConfigMap sources
For complete examples, see the examples/operator/mcp-servers/ directory, which includes:
- Basic MCP server deployments with different transports (stdio, SSE, streamable-http)
- Authentication configurations (inline OIDC, ConfigMap-based, Kubernetes-native)
- Resource and pod template customizations
- Tool filtering and middleware examples
Full CRD API documentation is available in docs/operator/crd-api.md.
See cmd/thv-operator/DESIGN.md for detailed decision documentation.
Key principles:
- Use CRD attributes for business logic affecting reconciliation
- Use PodTemplateSpec for infrastructure concerns (node selection, resources)
- Separate sync decision logic from sync execution
- Batch status updates to reduce API server load
Unlike local mode, Kubernetes mode stores state in:
- etcd (via Kubernetes API)
- ConfigMaps for RunConfig
- Secrets for sensitive data (OIDC client secrets, etc.)
- Status subresources for workload state
No local filesystem state required.
Proxy runner:
- Typically runs with 1 replica
- Multiple replicas may be possible with session affinity (not currently tested)
- Note: stdio transport requires single proxy instance due to exclusive stdin/stdout attachment
MCP server (StatefulSet):
- Scales independently from proxy (for SSE/Streamable HTTP transports)
- Stable network identities
- Persistent storage can be configured if needed
Operator:
- Single instance with leader election
- Watches cluster-wide or namespace-scoped
The workloads API (pkg/workloads/manager.go) provides a unified interface across all modes:
type Manager interface {
RunWorkload(ctx context.Context, runConfig *runner.RunConfig) error
RunWorkloadDetached(ctx context.Context, runConfig *runner.RunConfig) error
StopWorkloads(ctx context.Context, names []string) (*errgroup.Group, error)
DeleteWorkloads(ctx context.Context, names []string) (*errgroup.Group, error)
ListWorkloads(ctx context.Context, listAll bool, labelFilters ...string) ([]core.Workload, error)
GetWorkload(ctx context.Context, workloadName string) (core.Workload, error)
// ... more methods
}Mode-specific behavior is abstracted through:
- Runtime interface (
pkg/container/runtime/types.go) - Factory pattern for runtime selection (
pkg/container/factory.go)
classDiagram
class Runtime {
<<interface>>
+DeployWorkload()
+StopWorkload()
+RemoveWorkload()
+ListWorkloads()
+GetWorkloadInfo()
}
class DockerRuntime {
+DeployWorkload()
+StopWorkload()
...
}
class KubernetesRuntime {
+DeployWorkload()
+StopWorkload()
...
}
Runtime <|-- DockerRuntime
Runtime <|-- KubernetesRuntime
Implementation files:
- Docker:
pkg/container/docker/(implementation details in Docker engine integration) - Kubernetes: Operator uses Kubernetes API directly, not the Runtime interface
The RunConfig format (pkg/runner/config.go) is designed to be portable across all modes:
Local → Local: Direct JSON export/import via:
thv export <workload> <output-file>→ saves RunConfig JSONthv run --from-config <file>→ loads RunConfig JSON
Local → Kubernetes: Manual conversion:
- Export RunConfig from local workload
- Convert to MCPServer CRD YAML (tool support planned)
- Apply to cluster
Kubernetes → Kubernetes: Direct CRD replication
Implementation: pkg/container/runtime/types.go
ToolHive automatically detects runtime environment:
func IsKubernetesRuntime() bool {
// Check TOOLHIVE_RUNTIME env var
if runtimeEnv := os.Getenv("TOOLHIVE_RUNTIME"); runtimeEnv == "kubernetes" {
return true
}
// Check if running in K8s pod
return os.Getenv("KUBERNETES_SERVICE_HOST") != ""
}This allows the same codebase to behave appropriately in different environments.
- Developing MCP servers locally
- Quick testing and iteration
- Single-user environment
- No need for web UI
- Non-technical users need access
- Visual management preferred
- Local development with GUI
- Multiple users on same machine (API can be shared)
- Production deployments
- Multi-tenant requirements
- Need horizontal scaling
- HA and resilience required
- Integration with existing K8s infrastructure
- Centralized management of many MCP servers
- Export RunConfig:
thv export my-server runconfig.json - Convert to MCPServer CRD (manual or tool-assisted)
- Apply to cluster:
kubectl apply -f mcpserver.yaml
- Get MCPServer spec:
kubectl get mcpserver my-server -o yaml - Extract relevant fields to RunConfig format
- Import locally:
thv run --from-config runconfig.json
- Core Concepts - Workloads, transports, and more
- Transport Architecture - How proxying works
- RunConfig and Permissions - Configuration format
- Operator Architecture - Kubernetes operator details