A monolithic service that combines the Management API and UI for the AI/ML inference platform.
This service consolidates the previously separate Management API and UI components into a single deployment:
- Backend: API server
- Frontend: React application served as static files
- Deployment: Single Docker container and Kubernetes deployment
┌─────────────────────────────────────────┐
│           Management Service            │
│                                         │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │   Backend API   │ │    React UI     │ │
│ │                 │ │    (static)     │ │
│ │ • /api/*        │ │ • /*            │ │
│ │ • Authentication│ │ • Dashboard     │ │
│ │ • Model CRUD    │ │ • Model Forms   │ │
│ │ • Inference     │ │ • Testing       │ │
│ └─────────────────┘ └─────────────────┘ │
│                                         │
│                Port 8080                │
└─────────────────────────────────────────┘
- GET /health - Health check
- GET /api/tokens - Get JWT tokens
- GET /api/models - List models
- POST /api/models - Create model
- PUT /api/models/:name - Update model
- DELETE /api/models/:name - Delete model
- POST /api/models/:name/predict - Make predictions
- GET /api/models/:name/logs - Get model logs
- GET /api/tenant - Get tenant info
- GET /api/frameworks - List supported frameworks
- Dashboard: Tabbed interface for model management
- Model List: CRUD operations with real-time status
- Model Form: Create/edit models with framework selection
- Inference Testing: Interactive testing interface
- Authentication: JWT-based login with tenant selection
- Install dependencies:
  go mod download
- Build UI:
  cd ui && npm install && npm run build
- Start server:
  go run .
- Development mode (with automatic restart):
  # Install air for hot reload
  go install github.com/cosmtrek/air@latest
  air
management/
├── main.go # Main entry point
├── config.go # Configuration management
├── types.go # Type definitions
├── auth.go # Authentication service
├── models.go # Model management service
├── admin.go # Admin service
├── k8s.go # Kubernetes client
├── server.go # HTTP server and routing
├── utils.go # Utility functions
├── go.mod # Go module dependencies
├── go.sum # Go module checksums
├── Dockerfile # Docker build
├── README.md # This file
└── ui/ # React UI (unchanged)
    ├── package.json # UI dependencies
    ├── public/ # Static assets
    │   ├── index.html
    │   └── manifest.json
    └── src/ # React source
        ├── App.js
        ├── index.js
        ├── index.css
        ├── components/ # React components
        │   ├── Dashboard.js
        │   ├── Login.js
        │   ├── ModelList.js
        │   ├── ModelForm.js
        │   └── InferenceTest.js
        └── contexts/ # React contexts
            ├── AuthContext.js
            └── ApiContext.js
docker build -t management-service:latest .
docker run -d -p 8080:8080 --name management-service management-service:latest
The Dockerfile uses a multi-stage build:
- Stage 1: Build React UI with Node.js
- Stage 2: Build backend binary
- Stage 3: Create minimal runtime image with Alpine Linux
kubectl apply -f ../configs/management/management.yaml
kubectl port-forward svc/management-service 8085:80
- scripts/build-management.sh - Build and deploy backend
- scripts/deploy-management.sh - Deploy to Kubernetes
- scripts/build-and-push-images.sh - Build and push multi-arch Docker images
- scripts/build-local-images.sh - Build Docker images locally
# Build and deploy
./scripts/build-management.sh
# Deploy only
./scripts/deploy-management.sh
# Build and push Docker images
./scripts/build-and-push-images.sh
# Build local Docker images
./scripts/build-local-images.sh
- PORT - Server port (default: 8080)
- NODE_ENV - Environment (production/development)
The service uses JWT tokens for authentication:
- Tokens are validated for each API request
- Tenant information is extracted from JWT claims
- Multi-tenant isolation is enforced
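The validation steps above can be sketched with the Go standard library alone. Everything specific here is an assumption: the HS256 algorithm, the `tenant` claim name, and the helper names are illustrative, and the service may use a JWT library and different claims.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// claims holds the one field this sketch reads. The "tenant" claim name
// is hypothetical.
type claims struct {
	Tenant string `json:"tenant"`
}

// sign returns the base64url-encoded HMAC-SHA256 of "header.payload".
func sign(signingInput string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// tenantFromToken checks the token's signature and returns its tenant claim.
func tenantFromToken(token string, secret []byte) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("malformed token")
	}
	want := sign(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(want), []byte(parts[2])) {
		return "", errors.New("invalid signature")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return "", err
	}
	var c claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return "", err
	}
	if c.Tenant == "" {
		return "", errors.New("missing tenant claim")
	}
	return c.Tenant, nil
}

// newToken builds a signed token for the demo only.
func newToken(tenant string, secret []byte) string {
	enc := base64.RawURLEncoding.EncodeToString
	header := enc([]byte(`{"alg":"HS256","typ":"JWT"}`))
	payload, _ := json.Marshal(claims{Tenant: tenant})
	input := header + "." + enc(payload)
	return input + "." + sign(input, secret)
}

func main() {
	secret := []byte("demo-secret")
	tenant, err := tenantFromToken(newToken("team-a", secret), secret)
	fmt.Println(tenant, err)
}
```

The sketch deliberately skips expiry and `alg` header checks, which a production validator would need. Once the tenant is known, a handler can scope every Kubernetes or storage lookup to it to enforce isolation.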
This consolidated service replaces the separate management-api and management-ui services:
Before:
management-api/ → Port 8082 (Backend API)
management-ui/ → Port 80 (via Nginx)
After:
management/ → Port 8080 (Backend + React)
- Simplified deployment: Single service instead of two
- Reduced complexity: One Docker image, one Kubernetes deployment
- Better performance: No network overhead between API and UI, faster backend
- Easier maintenance: Single codebase and deployment pipeline
- Improved resource efficiency: Lower memory footprint and faster startup
- Better concurrency: Efficient concurrent request handling
- JWT-based authentication
- RBAC for Kubernetes access
- Non-root container user
- Health checks and resource limits
- CORS configuration for API access
- Health endpoint: /health
- Kubernetes liveness/readiness probes
- Resource monitoring and limits
- Logging to stdout/stderr
- Service not starting:
  kubectl logs deployment/management-service
- UI not loading:
  - Check if UI build artifacts are present
  - Verify port-forward is active
  - Check browser console for errors
- API not responding:
  - Check authentication tokens
  - Verify Kubernetes RBAC permissions
  - Check service endpoints
# Check deployment status
kubectl get deployment management-service
# View logs
kubectl logs -f deployment/management-service
# Check service endpoints
kubectl get endpoints management-service
# Port forward for local access
kubectl port-forward svc/management-service 8085:80
# Check health
curl http://localhost:8085/health
The backend has been refactored while maintaining 100% API compatibility:
- Performance: Improved memory usage and startup time
- Resource Requirements: Reduced from 256Mi to 128Mi memory
- Concurrency: Better concurrent request handling
- React Frontend: Completely unchanged
- API Endpoints: 100% compatible
- Authentication: JWT handling identical
- Docker Deployment: Same deployment process
- UI Functionality: All features work identically