AI agents that control computers like humans do.
Browser automation · Terminal access · Desktop control · Multi-agent orchestration
- Marketing: Market your product on Reddit autonomously
- Go-to-Market: Find prospects and send personalized emails
- QA Testing: Test every checkout flow and report bugs
- Job Application: Find roles, tailor your resume, and apply
- Form Filling: Fill out the YC S26 application for you
- Social Media: Post on Hacker News and engage with comments
Open Computer Use is an open-source platform that gives AI agents real computer control. Unlike chatbots that only talk about tasks, agents here actually perform them: browsing the web, running commands, clicking through UIs, and orchestrating multi-step workflows in isolated containers.
Computer use capabilities similar to Anthropic's Claude Computer Use, but fully open-source and extensible.
- Browser: Search-first web navigation, form filling, element interaction, multi-tab management, screenshot capture.
- Terminal: Command execution, file operations, script running, package management, output streaming.
- Desktop: Mouse & keyboard control, window management, screenshot analysis, UI element detection via computer vision.
- Planner: Decomposes complex requests into subtasks, assigns them to specialized agents, and passes context between steps.
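The planner flow described above (decompose, assign, pass context) can be pictured with a small sketch. Everything in it is hypothetical: `Subtask`, `run_plan`, and the stub agent functions are illustrative names, not the project's actual classes, and a real planner would presumably ask a model for the decomposition rather than use a fixed list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; not the project's real API.
Context = Dict[str, str]
Agent = Callable[[str, Context], str]


@dataclass
class Subtask:
    agent: str        # which specialized agent should handle this step
    instruction: str  # what that agent is asked to do
    output_key: str   # where the result is stored for later steps


def run_plan(plan: List[Subtask], agents: Dict[str, Agent]) -> Context:
    """Run subtasks in order, passing the accumulated context to each agent."""
    context: Context = {}
    for step in plan:
        if step.agent not in agents:
            raise KeyError(f"no agent registered for {step.agent!r}")
        result = agents[step.agent](step.instruction, context)
        context[step.output_key] = result
    return context


# Stub agents standing in for the browser and terminal agents.
def browser_agent(instruction: str, context: Context) -> str:
    return f"browser did: {instruction}"


def terminal_agent(instruction: str, context: Context) -> str:
    # Reads the previous step's result from the shared context.
    return f"terminal did: {instruction} (using {context['page']!r})"


if __name__ == "__main__":
    plan = [
        Subtask("browser", "open the pricing page", "page"),
        Subtask("terminal", "save the page summary to notes.txt", "saved"),
    ]
    final = run_plan(plan, {"browser": browser_agent, "terminal": terminal_agent})
    for key, value in final.items():
        print(f"{key}: {value}")
```

Keeping the context as a plain dictionary keyed by `output_key` makes it easy to see which step produced which value; a production system would likely need richer types and error handling.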
Frontend (Next.js 15)         Backend (FastAPI)              VM (Docker)
┌──────────────────┐     ┌─────────────────────────┐     ┌──────────────────┐
│  Chat UI         │────▶│  Multi-Agent Executor   │────▶│  Chrome Browser  │
│  Model Selector  │ SSE │  ├─ Planner Agent       │ WS  │  Terminal        │
│  VM Management   │◀────│  ├─ Browser Agent       │◀────│  Desktop (XFCE)  │
│  Zustand Stores  │     │  ├─ Terminal Agent      │     │  Agent Server    │
└──────────────────┘     │  └─ Desktop Agent       │     │  VNC :5900       │
                         │ WebSocket · DB · Billing│     └──────────────────┘
                         └─────────────────────────┘
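The diagram labels the frontend-to-backend link as SSE. As a rough illustration of what travels over such a link, the sketch below formats and parses Server-Sent Events frames using only the standard library. The event name `agent_step` and the payload fields are made up for the example; the project's real event schema is not described in this README.

```python
import json
from typing import Any, Dict, Iterator, Tuple


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one event in the text/event-stream wire format."""
    payload = json.dumps(data, separators=(",", ":"))
    # A blank line terminates each event.
    return f"event: {event}\ndata: {payload}\n\n"


def parse_sse(stream: str) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Parse frames produced by format_sse (single-line data only)."""
    for frame in stream.split("\n\n"):
        if not frame.strip():
            continue
        event, data = "message", ""  # "message" is the default SSE event type
        for line in frame.split("\n"):
            field, _, value = line.partition(": ")
            if field == "event":
                event = value
            elif field == "data":
                data = value
        yield event, json.loads(data)


if __name__ == "__main__":
    wire = format_sse("agent_step", {"agent": "browser", "status": "running"})
    wire += format_sse("agent_step", {"agent": "browser", "status": "done"})
    for name, body in parse_sse(wire):
        print(name, body)
```

The parser is deliberately narrow: it handles only the frames that `format_sse` emits and ignores other SSE features such as `id`, `retry`, and multi-line `data` fields.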
Node.js 20+ · Python 3.10+ · Docker · Supabase account · AI provider API key
git clone https://github.com/coasty-ai/open-computer-use.git
cd open-computer-use
# Frontend
npm install
# Backend
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
cd ..

cp .env.example .env
cp backend/.env.example backend/.env

Set these in both .env files:
# Supabase (required)
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE=your-service-role-key
# Security (required; generate with: openssl rand -hex 32)
ENCRYPTION_KEY=...
CSRF_SECRET=...
# AI provider (at least one)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Google Search (required for web search)
GOOGLE_SEARCH_KEY=...
GOOGLE_SEARCH_CX=...

# Via Supabase CLI
npm install -g supabase
supabase login
supabase link --project-ref your-project-ref
supabase db push
# Or paste supabase/schema.sql into the Supabase SQL Editor

Docker (recommended):
docker-compose up --build

Manual:
# Terminal 1 - Frontend
npm run dev
# Terminal 2 - Backend
cd backend && python main.py
# Terminal 3 - AI Desktop VM (optional)
docker-compose -f docker-compose.ai-desktop.yml up --build

Open http://localhost:3000, sign in, start a chat, and give your agent a task.
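Before starting the services, it can help to confirm that the `.env` variables listed earlier are actually set. The following is a minimal sketch and not part of the repository: `new_secret`, `parse_env`, and `missing_keys` are hypothetical helpers, and the simple `KEY=value` parsing is an assumption about the file format (no quoting or multi-line values).

```python
import secrets
from typing import Dict, List

# Variables the setup section marks as required.
REQUIRED = [
    "NEXT_PUBLIC_SUPABASE_URL",
    "NEXT_PUBLIC_SUPABASE_ANON_KEY",
    "SUPABASE_SERVICE_ROLE",
    "ENCRYPTION_KEY",
    "CSRF_SECRET",
]
# At least one AI provider key must be present.
PROVIDER_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]


def new_secret() -> str:
    """Return 32 random bytes as 64 hex characters, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_keys(env: Dict[str, str]) -> List[str]:
    """List required variables that are absent or empty."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if not any(env.get(key) for key in PROVIDER_KEYS):
        missing.append(" or ".join(PROVIDER_KEYS))
    return missing


if __name__ == "__main__":
    sample = "ENCRYPTION_KEY=" + new_secret() + "\n# comment\nOPENAI_API_KEY=sk-test\n"
    print(missing_keys(parse_env(sample)))
```

Note that this check only looks for empty or absent values, so a placeholder such as `...` left over from `.env.example` would still count as set.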
| Layer | Technologies |
|---|---|
| Frontend | Next.js 15, React 19, TypeScript, Tailwind CSS, Radix UI, Zustand, Vercel AI SDK |
| Backend | FastAPI, Python 3.10+, WebSockets, asyncio, uvicorn |
| AI Providers | OpenAI, Anthropic, Google, Azure, xAI, Mistral, Perplexity, OpenRouter |
| Infrastructure | Docker, Ubuntu 22.04 + XFCE, Chrome, Selenium, Supabase, Stripe |
| Desktop App | Electron 40, Puppeteer-core, platform-native automation (Win32 / CoreGraphics / xdotool) |
A lightweight overlay that runs AI agent commands directly on your local machine instead of a remote VM.
- Floating always-on-top pill UI with expanded chat panel
- Platform-native automation (PowerShell/Win32 on Windows, CoreGraphics/osascript on macOS, xdotool on Linux)
- Browser control via Puppeteer-core, shell execution, file operations
- WebSocket bridge to backend with auto-reconnect
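Auto-reconnect of the kind mentioned in the last bullet is commonly built on exponential backoff with jitter. The sketch below shows that general pattern in Python; the desktop app itself is an Electron project, and the function names, base delay, and cap here are illustrative assumptions rather than the app's actual settings.

```python
import random
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before retry number `attempt` (0-based), with full jitter."""
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


def connect_with_retry(connect: Callable[[], bool], max_attempts: int = 5,
                       sleep: Callable[[float], None] = lambda s: None) -> int:
    """Call `connect` until it succeeds; return the number of attempts used.

    `sleep` defaults to a no-op so the example runs instantly; pass
    time.sleep to actually wait between attempts.
    """
    for attempt in range(max_attempts):
        if connect():
            return attempt + 1
        if attempt < max_attempts - 1:
            # Wait only when another attempt will follow.
            sleep(backoff_delay(attempt))
    raise ConnectionError(f"gave up after {max_attempts} attempts")


if __name__ == "__main__":
    outcomes = iter([False, False, True])  # fail twice, then succeed
    print(connect_with_retry(lambda: next(outcomes)))
```

Randomizing the delay keeps many clients from reconnecting at the same instant after a backend restart, and the cap bounds the worst-case wait.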
cd electron
npm install
npm run dev # Development with hot reload
npm run package # Build for current platform

├── app/                  # Next.js routes & pages
├── components/           # React components (UI, chat, prompts)
├── lib/                  # Stores, providers, services, utilities
├── backend/
│   └── app/
│       ├── api/routes/   # FastAPI endpoints
│       ├── services/     # Multi-agent executor, VM control, billing
│       ├── providers/    # AI provider integrations
│       └── core/         # Config, middleware, logging
├── electron/
│   └── src/
│       ├── main/         # App lifecycle, IPC, automation modules
│       ├── preload/      # Context bridge API
│       └── renderer/     # React UI, stores, components
├── docker/ai-desktop/    # Ubuntu VM container
└── supabase/             # Database schema
- Fork the repo
- Create a branch: `git checkout -b feature/your-feature`
- Commit your changes
- Open a pull request
Bug reports and feature requests welcome in Issues.
- Multi-VM parallel orchestration
- Visual workflow builder
- Agent marketplace & templates
- Windows / macOS VM support
- Plugin system for custom tools
- Collaborative sessions
- Voice control & video understanding
This platform gives AI agents significant autonomy. Use it to automate repetitive tasks, testing, research, and content creation, not to violate terms of service, spam, or scrape without permission. Always use isolated environments, respect robots.txt, and follow data protection laws.
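One way to act on the robots.txt guidance is to check a URL against a site's rules before an agent fetches it. This sketch uses Python's standard `urllib.robotparser`; the user-agent string and the sample rules are placeholders, and whether the project's agents perform such a check is not stated here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent string for illustration.
AGENT_NAME = "open-computer-use-agent"


def is_allowed(robots_txt: str, url: str, user_agent: str = AGENT_NAME) -> bool:
    """Return True if `url` may be fetched under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    print(is_allowed(rules, "https://example.com/private/data"))
    print(is_allowed(rules, "https://example.com/blog"))
```

Passing the rules in as text keeps the example offline; in practice the file would be fetched from the target site's `/robots.txt` first.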
Apache License 2.0. Copyright (c) 2025 Open Computer Use Contributors