Scenario: You've SSH'd into the fresh Ubuntu Server from your Mac. Do these steps in order.
sudo apt update && sudo apt install -y nvidia-driver-570
sudo reboot

Wait 30 seconds, SSH back in, then verify:
nvidia-smi

Should show RTX 5090, 32GB.
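
The check above can also be scripted, which helps when the setup is repeated. This is a minimal sketch, assuming the CSV output of `nvidia-smi --query-gpu`. The `gpu_matches` helper is a hypothetical name, not part of the original steps, and the query only runs if `nvidia-smi` is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a "name, memory" CSV line mentions the expected GPU model.
gpu_matches() {
  local line="$1" expected="$2"
  case "$line" in
    *"$expected"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only query the driver if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  line="$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader | head -n 1)"
  if gpu_matches "$line" "RTX 5090"; then
    echo "GPU OK: $line"
  else
    echo "Unexpected GPU: $line" >&2
  fi
fi
```

The first line of the query is used, which is enough for a single-GPU machine.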
# Docker
curl -fsSL https://get.docker.com | sudo sh
sudo usermod -aG docker $USER
newgrp docker
# NVIDIA Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update && sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Verify GPU in Docker
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu24.04 nvidia-smi

sudo apt install -y python3.12 python3-pip python3.12-venv git curl

# Install Node.js (required for Claude Code)
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt install -y nodejs
# Install Claude Code
npm install -g @anthropic-ai/claude-code
# Login (follow the browser auth flow — it will give you a URL to open on your Mac)
claude

# Set up SSH key for GitHub (if not already)
ssh-keygen -t ed25519 -C "your@email.com"
cat ~/.ssh/id_ed25519.pub
# Copy the output and add it at: https://github.com/settings/ssh/new
# Clone
git clone git@github.com:JoshuaPubNub/vibercoded.git ~/vibercoded
cd ~/vibercoded
# Install Python dependencies
pip install -e ".[dev]"

cd ~/vibercoded/gpu-machine
docker compose up -d

Wait for model to load (~2-3 minutes):
until curl -s http://localhost:8000/health > /dev/null 2>&1; do sleep 5; echo "waiting..."; done
echo "vLLM ready!"

Test it:
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"Kbenkhaled/Qwen3.5-27B-NVFP4","messages":[{"role":"user","content":"Hello"}],"max_tokens":10}'

cd ~/vibercoded
nohup python3 -c "from agent_creator.main import main; main()" > data/server.log 2>&1 &

Verify:
curl http://localhost:7000/api/llm/health

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

Follow the auth URL. After that, your Tailscale IP is:
tailscale ip -4

Team accesses UI at: http://YOUR_TAILSCALE_IP:7000
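
The readiness loop used for vLLM earlier never gives up, and the AgentCreator check is a single request. Before sharing the URL, it may help to poll both health endpoints with a timeout and then print the address to hand out. This is a minimal sketch: `wait_until` and `ui_url` are hypothetical helper names, and the timeouts in the usage comments are guesses, not values from the original steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or the timeout (in seconds) expires.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@" >/dev/null 2>&1; do
    if (( SECONDS >= deadline )); then
      return 1   # give up: the command never succeeded in time
    fi
    sleep "$interval"
  done
  return 0
}

# Hypothetical helper: build the URL teammates open; the port defaults to 7000.
ui_url() {
  printf 'http://%s:%s\n' "$1" "${2:-7000}"
}

# Example usage on the server (timeouts are assumptions):
#   wait_until 300 5 curl -sf http://localhost:8000/health || echo "vLLM not ready" >&2
#   wait_until 60 2 curl -sf http://localhost:7000/api/llm/health || echo "AgentCreator not ready" >&2
#   ui_url "$(tailscale ip -4)"
```

Passing the probe as a command keeps the helper independent of `curl`, so the same function can wait on either service.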
cd ~/vibercoded
python3 scripts/build_all_agents.py --output data/build_report.md

cd ~/vibercoded
python3 tests/integration/test_runner.py --output data/test_report.md

# SSH in
ssh joshua@100.x.x.x
# Check everything is running
nvidia-smi # GPU status
docker ps # Running containers
curl http://localhost:8000/health # vLLM health
curl http://localhost:7000/api/llm/health # AgentCreator health
# Start services after reboot
cd ~/vibercoded/gpu-machine && docker compose up -d # vLLM
cd ~/vibercoded && nohup python3 -c "from agent_creator.main import main; main()" > data/server.log 2>&1 & # AgentCreator
# View logs
tail -f ~/vibercoded/data/server.log # AgentCreator logs
docker logs gpu-machine-vllm-1 --tail 20 # vLLM logs
# Stop everything
pkill -f agent_creator # Stop AgentCreator
docker stop $(docker ps -q) # Stop all containers
# Run Claude Code
cd ~/vibercoded && claude

| | Windows (old) | Linux (new) |
|---|---|---|
| Inference speed | 3-5 tok/s | 60-80 tok/s |
| GPU clock lock | Needed every reboot | Not needed |
| pin_memory | Disabled (WSL) | Enabled (native) |
| Docker GPU | Via WSL2 layer | Direct |
| Access | Desktop + JumpDesktop | SSH + Tailscale |
| Agent build time (197) | 4-6 hours | ~45 minutes |
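
The ratios implied by the table can be worked out directly. This is a small sketch that uses only the figures quoted above. The `ratio` helper is a hypothetical name, and `awk` handles the floating-point division.

```shell
#!/usr/bin/env bash
# Hypothetical helper: divide two numbers and print the result with one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Inference speed: smallest and largest speedup from the quoted ranges.
echo "inference speedup (min): $(ratio 60 5)x"   # 60 tok/s vs 5 tok/s
echo "inference speedup (max): $(ratio 80 3)x"   # 80 tok/s vs 3 tok/s

# Build time for 197 agents, in minutes: 4-6 hours vs about 45 minutes.
echo "build speedup (min): $(ratio 240 45)x"
echo "build speedup (max): $(ratio 360 45)x"

# Average seconds per agent on Linux: 45 minutes spread over 197 agents.
echo "seconds per agent: $(ratio 2700 197)"
```

With these numbers, inference is roughly 12x to 27x faster, the full build is roughly 5x to 8x faster, and each agent takes about 14 seconds on average.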