
Deployment Guide

Complete guide for deploying Perpendicularity in production environments.


📋 Table of Contents

  • Overview
  • Docker Deployment
  • Local Deployment
  • Cloud Deployment
  • Environment Configuration
  • Monitoring
  • Scaling
  • Quick Deploy Commands
  • See Also

🎯 Overview

Perpendicularity supports multiple deployment strategies:

  • Docker - Containerized deployment (recommended)
  • Local - Direct installation with uv/pip
  • AWS EC2 - Cloud deployment with GPU support
  • Kubernetes - Orchestrated deployment (advanced)

Recommended Stack:

  • Production: Docker on EC2 with Ollama
  • Development: Local installation with cloud models
  • Enterprise: Kubernetes with load balancing

🐳 Docker Deployment

Multi-Stage Build

The Dockerfile uses a multi-stage build to optimize image size and support optional local models.

Build Stages:

  1. Frontend Builder - Build React app
  2. Python Backend - Install Python dependencies
  3. Local Models (optional) - Install transformers for HuggingFace models

Build Without Local Models (Recommended)

For cloud models only (Gemini, Claude, Ollama via network):

# Build image (no local models)
docker buildx build --platform linux/amd64 -t perpendicularity:0.1.0 .

# Image size: ~1.5GB

What's included:

  • ✅ Python runtime
  • ✅ Core dependencies (FastAPI, Click, LangChain)
  • ✅ Frontend (React app)
  • ✅ API server
  • ❌ PyTorch and Transformers (not needed for cloud models)

Use when:

  • Using Gemini or Claude (cloud APIs)
  • Using Ollama (separate service)
  • Want smaller Docker images
  • Don't need HuggingFace Transformers
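To confirm the slim build really excludes PyTorch, you can probe the image directly (a quick sketch; it assumes uv's default /app/.venv location inside the container):

# The import should fail in the slim image
docker run --rm perpendicularity:0.1.0 \
  /app/.venv/bin/python -c "import torch" 2>/dev/null \
  && echo "torch present (unexpected in the slim build)" \
  || echo "torch absent (expected)"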

Build With Local Models

For HuggingFace Transformers (models loaded directly in container):

# Build with local models support
docker buildx build --platform linux/amd64 \
  --build-arg INSTALL_LOCAL_MODELS=true \
  -t perpendicularity:0.1.0-gpu .

# Image size: ~8GB (includes PyTorch, CUDA libraries)

What's included:

  • ✅ Everything from base build
  • ✅ PyTorch with CUDA support
  • ✅ Transformers library
  • ✅ GPU acceleration libraries

Use when:

  • Loading models with HuggingFace Transformers
  • Running models inside container (not via Ollama)
  • Need full offline capability
  • Have GPU available
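Before deploying, it helps to confirm the GPU image actually sees CUDA (a sketch; assumes uv's default /app/.venv path):

# Should print True on a GPU host with nvidia-docker2 configured
docker run --rm --gpus all perpendicularity:0.1.0-gpu \
  /app/.venv/bin/python -c "import torch; print(torch.cuda.is_available())"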

Dockerfile Structure

# Stage 1: Build frontend
FROM node:20-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ .
RUN npm run build

# Stage 2: Python dependencies
FROM python:3.11-slim AS base

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install uv
RUN pip install uv

WORKDIR /app

# Copy Python project files
COPY pyproject.toml uv.lock ./
COPY agent/ agent/
COPY cli/ cli/
COPY api/ api/
COPY config/ config/

# Install Python dependencies (without local-models)
RUN uv sync --extra api

# Stage 3: Add local models support (conditional)
FROM base AS local-models
ARG INSTALL_LOCAL_MODELS=false

# Install transformers and PyTorch only if requested
RUN if [ "$INSTALL_LOCAL_MODELS" = "true" ]; then \
      uv sync --extra local-models --extra api; \
    fi

# Final stage
FROM local-models AS final

# Copy frontend build
COPY --from=frontend-builder /app/frontend/dist api/static

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:8000/api/health || exit 1

# Run API server
CMD ["perpendicularity", "api", "--host", "0.0.0.0", "--port", "8000"]
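A small build context keeps these stages fast. A minimal .dockerignore along these lines is worth adding (the entries are suggestions; adjust to your repo layout):

cat > .dockerignore <<'EOF'
.git
.venv
__pycache__/
*.pyc
frontend/node_modules
frontend/dist
EOF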

Running the Container

Basic Run (Cloud Models)

# Run with default config
docker run -d \
  --name perpendicularity \
  -p 8000:8000 \
  -e GOOGLE_API_KEY="${GOOGLE_API_KEY}" \
  -e ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY}" \
  perpendicularity:0.1.0

# Access at http://localhost:8000
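After starting the container, a quick verification loop catches misconfiguration early:

# Confirm the container is running and the API responds
docker ps --filter name=perpendicularity
docker logs perpendicularity --tail 20
curl http://localhost:8000/api/health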

Run with Custom Config

# Run with custom config file
docker run -d \
  --name perpendicularity \
  -p 8000:8000 \
  -v $(pwd)/config/agent_config.yaml:/app/config/agent_config.yaml:ro \
  -e GOOGLE_API_KEY="${GOOGLE_API_KEY}" \
  perpendicularity:0.1.0 \
  perpendicularity api --config /app/config/agent_config.yaml

Run with Ollama (Host Network)

# Use host network to access Ollama at localhost:11434
docker run -d \
  --name perpendicularity \
  --network host \
  -v $(pwd)/config/agent_config.yaml:/app/config/agent_config.yaml:ro \
  perpendicularity:0.1.0 \
  perpendicularity api --config /app/config/agent_config.yaml

# Access at http://localhost:8000

Why --network host?

  • Container can access Ollama at localhost:11434
  • Simpler than bridge networking for local services
  • Container ports bind directly to host

Run with GPU (Local Models)

# Run with GPU access for HuggingFace models
docker run -d \
  --name perpendicularity \
  --gpus all \
  -p 8000:8000 \
  -v $(pwd)/config/agent_config.yaml:/app/config/agent_config.yaml:ro \
  perpendicularity:0.1.0-gpu \
  perpendicularity api --config /app/config/agent_config.yaml

# Verify GPU access
docker exec perpendicularity nvidia-smi

Requirements:

  • NVIDIA GPU
  • nvidia-docker2 installed
  • Image built with --build-arg INSTALL_LOCAL_MODELS=true
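If GPU access fails, smoke-test the host's NVIDIA container setup independently of Perpendicularity (the CUDA image tag below is just an example):

# nvidia-smi output here confirms the container toolkit is wired up correctly
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi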

Advanced Docker Options

Build Optimization

# buildx always uses BuildKit; the DOCKER_BUILDKIT=1 flag only matters for the legacy builder
DOCKER_BUILDKIT=1 docker build -t perpendicularity:0.1.0 .

# Multi-platform build (add --push to publish; the local Docker store cannot hold a multi-arch manifest without containerd)
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t perpendicularity:0.1.0 .

# Build with a specific Python version (requires an ARG PYTHON_VERSION declared in the Dockerfile)
docker buildx build --platform linux/amd64 \
  --build-arg PYTHON_VERSION=3.11 \
  -t perpendicularity:0.1.0 .

Cache Management

# Build with no cache
docker buildx build --platform linux/amd64 --no-cache -t perpendicularity:0.1.0 .

# Reuse layers from a previous image (the cache source must have been built with inline cache, e.g. --cache-to type=inline)
docker buildx build --platform linux/amd64 \
  --cache-from perpendicularity:latest \
  -t perpendicularity:0.1.0 .

Resource Limits

# Limit CPU and memory
docker run -d \
  --name perpendicularity \
  --cpus="2.0" \
  --memory="4g" \
  --memory-swap="4g" \
  -p 8000:8000 \
  perpendicularity:0.1.0

💻 Local Deployment

Production Installation

Using uv (Recommended):

# Clone repository
git clone https://github.com/t-neumann/perpendicularity.git
cd perpendicularity

# Install with production extras
uv sync --extra api

# Verify the installation
uv run perpendicularity --help
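For an unsupervised production host, wrapping the local install in a systemd unit keeps it running across reboots. A sketch (the paths, environment, and uv location are assumptions for your environment):

sudo tee /etc/systemd/system/perpendicularity.service > /dev/null <<'EOF'
[Unit]
Description=Perpendicularity API
After=network.target

[Service]
WorkingDirectory=/opt/perpendicularity
Environment=GOOGLE_API_KEY=replace-me
ExecStart=/usr/local/bin/uv run perpendicularity api --host 0.0.0.0 --port 8000
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now perpendicularity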

☁️ Cloud Deployment

AWS EC2 (Recommended)

Complete EC2 setup with Ollama - see EC2 Setup Guide for detailed instructions.

Quick Overview:

# 1. Launch EC2 instance (g5.xlarge for GPU)
# 2. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b-instruct

# 3. Deploy Perpendicularity
git clone https://github.com/t-neumann/perpendicularity.git
cd perpendicularity
docker buildx build --platform linux/amd64 -t perpendicularity:0.1.0 .
docker run -d --network host \
  -v $(pwd)/config/agent_config.yaml:/app/config/agent_config.yaml:ro \
  perpendicularity:0.1.0

# 4. Access at http://EC2_PUBLIC_IP:8000
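Before wiring the agent to Ollama, verify both services independently:

# Ollama should list the pulled model
curl http://localhost:11434/api/tags

# The deployed API should report healthy
curl http://localhost:8000/api/health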

⚙️ Environment Configuration

Environment Variables

Required for cloud models:

# Gemini API
export GOOGLE_API_KEY="AIza..."

# Claude API
export ANTHROPIC_API_KEY="sk-ant-..."

# Optional: Override config path
export PERPENDICULARITY_CONFIG="/path/to/config.yaml"
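Rather than exporting keys in your shell, an env file passed to Docker keeps secrets out of shell history (a sketch; placeholder values shown):

cat > .env <<'EOF'
GOOGLE_API_KEY=AIza...
ANTHROPIC_API_KEY=sk-ant-...
EOF
chmod 600 .env

docker run -d --name perpendicularity -p 8000:8000 --env-file .env perpendicularity:0.1.0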

Configuration Files

Production config structure:

config/
├── agent_config.yaml          # Main configuration
├── agent_config.prod.yaml     # Production overrides
├── agent_config.dev.yaml      # Development
└── prompts.yaml               # System prompts

Production config example:

# config/agent_config.prod.yaml

default_model: "ollama_qwen32b"  # High-quality local model

models:
  defaults:
    openai:
      base_url: "http://localhost:11434/v1"
  
  # Production-grade local model
  ollama_qwen32b:
    type: "openai"
    name: "qwen2.5:32b-instruct"

agent:
  type: "langgraph"
  recursion_limit: 25
  verbose: false  # Disable verbose in production

mcp_servers:
  genomic_ops:
    url: "http://genomic-server.internal:8000/mcp"
    timeout: 180
  
  txgemma:
    url: "http://txgemma-server.internal:8001/mcp"
    timeout: 180

logging:
  level: "WARNING"  # Only warnings and errors

Environment-Specific Deployment

# Development
perpendicularity api --config config/agent_config.dev.yaml

# Staging
perpendicularity api --config config/agent_config.staging.yaml

# Production
perpendicularity api --config config/agent_config.prod.yaml --workers 4

HTTPS/TLS

Production must use HTTPS!

Option 1: Nginx Reverse Proxy

# /etc/nginx/sites-available/perpendicularity

server {
    listen 80;
    server_name perpendicularity.example.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name perpendicularity.example.com;

    ssl_certificate /etc/letsencrypt/live/perpendicularity.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/perpendicularity.example.com/privkey.pem;

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
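Enable and validate the site before relying on it (the sites-enabled symlink is the Debian/Ubuntu convention):

sudo ln -s /etc/nginx/sites-available/perpendicularity /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx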

Get SSL certificate:

# Using Let's Encrypt
sudo certbot --nginx -d perpendicularity.example.com
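Certbot sets up automatic renewal; confirm it actually works before expiry forces the issue:

# Simulate renewal without touching the live certificate
sudo certbot renew --dry-run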

📊 Monitoring

Health Checks

Built-in endpoint:

# Check API health
curl http://localhost:8000/api/health

# Response:
{
  "status": "healthy",
  "service": "perpendicularity-api"
}
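The Docker HEALTHCHECK covers the container itself; an external probe can also restart a wedged container. A minimal cron-able sketch (the restart-on-failure policy is an assumption; adapt to your alerting):

#!/usr/bin/env bash
# healthcheck.sh - probe the API and restart the container on failure
if ! curl -fsS http://localhost:8000/api/health > /dev/null; then
  echo "$(date -Is) perpendicularity unhealthy, restarting" >&2
  docker restart perpendicularity
fi

Schedule it with cron, e.g. */5 * * * * /opt/perpendicularity/healthcheck.sh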

Logging

Configure logging in production:

# config/agent_config.yaml

logging:
  level: "INFO"
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  file: "/var/log/perpendicularity/api.log"
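When logging to a file, rotate it so the disk doesn't fill. A logrotate sketch matching the path above:

sudo tee /etc/logrotate.d/perpendicularity > /dev/null <<'EOF'
/var/log/perpendicularity/api.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
    copytruncate
}
EOF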

📈 Scaling

Horizontal Scaling

Multiple workers:

# Run with multiple workers
perpendicularity api --workers 4

# Or in Docker
docker run -d \
  -p 8000:8000 \
  perpendicularity:0.1.0 \
  perpendicularity api --workers 4
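Beyond in-process workers, you can run several containers and round-robin them behind the nginx proxy from the HTTPS section (the ports below are examples):

docker run -d --name perpendicularity-1 -p 8001:8000 perpendicularity:0.1.0
docker run -d --name perpendicularity-2 -p 8002:8000 perpendicularity:0.1.0

# In nginx, replace the single proxy_pass target with an upstream block
# listing 127.0.0.1:8001 and 127.0.0.1:8002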

🚀 Quick Deploy Commands

Development (Local)

# Clone and run locally
git clone https://github.com/t-neumann/perpendicularity.git
cd perpendicularity
uv sync --extra api
export GOOGLE_API_KEY="your-key"
perpendicularity api --reload

Production (Docker)

# Build and run in Docker
docker buildx build --platform linux/amd64 -t perpendicularity:0.1.0 .
docker run -d \
  --name perpendicularity \
  --restart unless-stopped \
  -p 8000:8000 \
  -e GOOGLE_API_KEY="${GOOGLE_API_KEY}" \
  perpendicularity:0.1.0

Production (EC2 + Ollama)

# On EC2 instance
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b-instruct

git clone https://github.com/t-neumann/perpendicularity.git
cd perpendicularity
docker buildx build --platform linux/amd64 -t perpendicularity:0.1.0 .
docker run -d --network host perpendicularity:0.1.0

See EC2 Setup Guide for complete instructions.


📚 See Also

  • EC2 Setup Guide - complete EC2 + Ollama walkthrough
  • Troubleshooting - common deployment issues

Deploy Perpendicularity with confidence! 🚀

For questions, see Troubleshooting or open an issue.