This guide provides detailed step-by-step instructions to get the platform up and running.
📖 Quick Reference: For a condensed quick start, see the Quick Start section in README.md
🎯 Why This Project: To understand the goals and vision behind this platform, start with GOALS.md
📋 System Requirements: For detailed system requirements and tool versions, see Prerequisites in README.md
Ensure you have the following tools installed:
- Docker 20.10+ with Kubernetes enabled
- kubectl 1.24+
- Kind 0.20+
- Helm 3.12+
- curl and jq (for API testing)
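Before bootstrapping, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and it only checks that each command exists, not that it meets the minimum version.

```shell
# Print each command name from the arguments that is not found on PATH.
# (Hypothetical helper; presence check only, versions are not inspected.)
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools docker kubectl kind helm curl jq)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found on PATH."
fi
```

If anything is reported missing, install it and compare versions manually (for example with `kubectl version --client` or `helm version`) before running the bootstrap script.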
# Clone the repository
git clone https://github.com/smarunich/inference-in-a-box.git
cd inference-in-a-box
# One-command bootstrap (takes 10-15 minutes)
./scripts/bootstrap.sh
🔧 What Bootstrap Does: For detailed information about what the bootstrap script installs, see Technology Stack
# Check cluster is ready
kubectl get nodes
# Verify core components
kubectl get pods -A | grep -E "(istio|envoy|kserve|knative)"
# Check sample models are deployed
kubectl get inferenceservice -A
# Get JWT tokens for different tenants
./scripts/get-jwt-tokens.sh
# This creates tokens for tenant-a, tenant-b, and tenant-c
export TENANT_A_TOKEN="<token-from-script>"
🌐 Service Access: For complete service access information and port forwarding commands, see Usage Guide
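If a request is later rejected, it is often useful to look inside the exported token. The sketch below decodes the payload segment of a JWT without verifying its signature. `jwt_payload` is a hypothetical helper, not something shipped with the repository, and the claims you will see depend on how the token script issues tokens.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
# The signature is NOT verified; this is for inspection only.
jwt_payload() {
  local segment
  # JWTs use the URL-safe base64 alphabet; map it back to the standard one.
  segment=$(printf '%s' "$1" | cut -d '.' -f2 | tr '_-' '/+')
  # Restore the padding that the JWT encoding strips.
  case $(( ${#segment} % 4 )) in
    2) segment="${segment}==" ;;
    3) segment="${segment}=" ;;
  esac
  printf '%s' "$segment" | base64 -d
}

# Example (assumes TENANT_A_TOKEN was exported as shown above):
# jwt_payload "$TENANT_A_TOKEN" | jq .
```

Checking fields such as the expiry or tenant-related claims this way can quickly rule out a stale or mismatched token.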
# Access management UI
kubectl port-forward svc/management-service 8085:80
# Open browser: http://localhost:8085
🎭 Complete Demo Guide: For comprehensive demo scenarios and explanations, see demo.md
# Interactive demo with multiple scenarios
./scripts/demo.sh
📝 Complete API Guide: For detailed API usage and examples, see Usage Guide
Test the sklearn-iris model:
# Get your JWT token first
export JWT_TOKEN=$(./scripts/get-jwt-tokens.sh | grep "tenant-a" | cut -d' ' -f2)
# Make inference request
curl -H "Authorization: Bearer $JWT_TOKEN" \
-H "x-ai-eg-model: sklearn-iris" \
-H "Content-Type: application/json" \
http://localhost:8080/v1/models/sklearn-iris:predict \
-d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'
After successful setup, explore these key areas:
📘 Complete Guide: See Model Publishing Guide
Use the Management Service to publish models for external access with rate limiting and authentication.
🏗️ Technical Details: See Architecture Documentation
Learn about the dual-gateway design and multi-tenant security.
⚡ API Reference: See Management Service API
Explore the full REST API for programmatic model management.
# Complete platform teardown
./scripts/cleanup.sh
🔧 Complete Troubleshooting: For detailed troubleshooting steps, see README.md - Troubleshooting
Common verification commands:
# Check cluster health: list pods that are neither Running nor Completed
kubectl get pods --all-namespaces | grep -vE "Running|Completed"
# Verify AI Gateway
kubectl get pods -n envoy-gateway-system
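Right after bootstrap, pods can take a few minutes to settle, so a single check may report transient failures. The following sketch polls instead of checking once. Both `retry_until` and `unhealthy_pods` are hypothetical helpers, not part of the repository, and a `Running` status does not guarantee that every container in the pod is ready.

```shell
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Read `kubectl get pods --no-headers` output on stdin and print the rows
# whose STATUS column is neither Running nor Completed.
unhealthy_pods() {
  grep -vE '[[:space:]](Running|Completed)[[:space:]]' || true
}

# Succeeds only when kubectl works and no unhealthy rows remain.
all_pods_healthy() {
  local pods
  pods=$(kubectl get pods --all-namespaces --no-headers) || return 1
  [ -z "$(printf '%s\n' "$pods" | unhealthy_pods)" ]
}

# Example: poll every 10 seconds, up to 30 times.
# retry_until 30 10 all_pods_healthy
```

Separating the filter from the `kubectl` call keeps the status logic easy to try against saved output when no cluster is available.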