
coasty-ai/open-computer-use

Coasty

Open Computer Use

AI agents that control computers like humans do.

Browser automation · Terminal access · Desktop control · Multi-agent orchestration


Website · Discord · Twitter






See it in action

  • Marketing: Market your product on Reddit autonomously
  • Go-to-Market: Find prospects and send personalized emails
  • QA Testing: Test every checkout flow and report bugs
  • Job Application: Find roles, tailor your resume, and apply
  • Form Filling: Fill out the YC S26 application for you
  • Social Media: Post on Hacker News and engage with comments



What is this?

Open Computer Use is an open-source platform that gives AI agents real computer control. Unlike chatbots that only talk about tasks, agents here actually perform them: browsing the web, running commands, clicking through UIs, and orchestrating multi-step workflows in isolated containers.

It provides computer-use capabilities similar to Anthropic's Claude Computer Use, but is fully open-source and extensible.




Agents

Browser: Search-first web navigation, form filling, element interaction, multi-tab management, screenshot capture.

Terminal: Command execution, file operations, script running, package management, output streaming.

Desktop: Mouse & keyboard control, window management, screenshot analysis, UI element detection via computer vision.

Planner: Decomposes complex requests into subtasks, assigns them to specialized agents, and passes context between steps.
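
As a rough illustration of how a planner could hand subtasks to specialized agents and pass context between steps, here is a minimal sketch. The names `Subtask`, `run_plan`, and the stub handlers are hypothetical and are not taken from the project's codebase; in the real system the plan would come from a model rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical type for illustration; not the project's actual API.
@dataclass
class Subtask:
    agent: str        # e.g. "browser", "terminal", or "desktop"
    instruction: str


# A handler receives the instruction plus the shared context and returns a result.
Handler = Callable[[str, Dict[str, str]], str]


def run_plan(plan: List[Subtask], handlers: Dict[str, Handler]) -> Dict[str, str]:
    """Run subtasks in order, storing each result so later steps can read it."""
    context: Dict[str, str] = {}
    for index, task in enumerate(plan):
        handler = handlers[task.agent]  # raises KeyError for an unknown agent
        context[f"step_{index}"] = handler(task.instruction, context)
    return context


# Stub handlers standing in for real agents.
def browser_stub(instruction: str, context: Dict[str, str]) -> str:
    return f"browser did: {instruction}"


def terminal_stub(instruction: str, context: Dict[str, str]) -> str:
    previous = context.get("step_0", "")
    return f"terminal did: {instruction} (after '{previous}')"


plan = [
    Subtask("browser", "find the latest release notes"),
    Subtask("terminal", "save the notes to notes.txt"),
]
result = run_plan(plan, {"browser": browser_stub, "terminal": terminal_stub})
```

The shared `context` dictionary is the simplest way to let a later step see an earlier step's output; a production executor would likely also need error handling and retries.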




Architecture

Frontend (Next.js 15)         Backend (FastAPI)              VM (Docker)
┌──────────────────┐     ┌──────────────────────────┐     ┌──────────────────┐
│  Chat UI         │────▶│  Multi-Agent Executor    │────▶│  Chrome Browser  │
│  Model Selector  │ SSE │  ├─ Planner Agent        │ WS  │  Terminal        │
│  VM Management   │◀────│  ├─ Browser Agent        │◀────│  Desktop (XFCE)  │
│  Zustand Stores  │     │  ├─ Terminal Agent       │     │  Agent Server    │
└──────────────────┘     │  └─ Desktop Agent        │     │  VNC :5900       │
                         │  WebSocket · DB · Billing│     └──────────────────┘
                         └──────────────────────────┘
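
The diagram labels the link between the frontend and the backend as SSE (Server-Sent Events). As a sketch of what one such message looks like on the wire, the helper below frames a JSON payload as an SSE message. The function name `format_sse`, the event name `agent_step`, and the payload fields are assumptions made for illustration; the README does not document the project's actual event schema.

```python
import json
from typing import Any, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message.

    A message is a group of "field: value" lines terminated by a blank line.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps produces a single line by default, so one data field is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


# Hypothetical payload; the real event names and fields are not documented here.
message = format_sse({"agent": "browser", "status": "done"}, event="agent_step")
```

A streaming HTTP response with the `text/event-stream` content type would emit one such string per update, and the browser's `EventSource` API would deliver each one to the chat UI.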



Quick Start

Prerequisites

Node.js 20+ · Python 3.10+ · Docker · Supabase account · AI provider API key

1. Clone & install

git clone https://github.com/coasty-ai/open-computer-use.git
cd open-computer-use

# Frontend
npm install

# Backend
cd backend
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
cd ..

2. Configure environment

cp .env.example .env
cp backend/.env.example backend/.env

Set these in both .env files:

# Supabase (required)
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE=your-service-role-key

# Security (required; generate each with: openssl rand -hex 32)
ENCRYPTION_KEY=...
CSRF_SECRET=...

# AI provider (at least one)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Google Search (required for web search)
GOOGLE_SEARCH_KEY=...
GOOGLE_SEARCH_CX=...
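
Before starting the services, it can help to confirm that both .env files contain the required settings. The following is a minimal sketch of such a check, not a script shipped with the project: the helper names `parse_env` and `missing_settings` are hypothetical, and it treats the Google Search keys as optional because the comments above only require them for web search.

```python
import secrets
from typing import Dict, List

# Keys marked "required" in the configuration block above.
REQUIRED_KEYS = [
    "NEXT_PUBLIC_SUPABASE_URL",
    "NEXT_PUBLIC_SUPABASE_ANON_KEY",
    "SUPABASE_SERVICE_ROLE",
    "ENCRYPTION_KEY",
    "CSRF_SECRET",
]
# At least one of these must be set.
AI_PROVIDER_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(env: Dict[str, str]) -> List[str]:
    """Return the names of required settings that are absent or empty."""
    problems = [key for key in REQUIRED_KEYS if not env.get(key)]
    if not any(env.get(key) for key in AI_PROVIDER_KEYS):
        problems.append("one of " + " / ".join(AI_PROVIDER_KEYS))
    return problems


# secrets.token_hex(32) yields 64 hex characters, the same shape as
# the output of `openssl rand -hex 32`.
sample = "\n".join([
    "# Supabase",
    "NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co",
    "NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key",
    "SUPABASE_SERVICE_ROLE=your-service-role-key",
    f"ENCRYPTION_KEY={secrets.token_hex(32)}",
])
problems = missing_settings(parse_env(sample))
```

To check a real file, pass its contents instead of `sample`, for example `parse_env(open(".env").read())`. The check only tests for presence, so placeholder values such as `...` still count as set.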

3. Set up database

# Via Supabase CLI
npm install -g supabase
supabase login
supabase link --project-ref your-project-ref
supabase db push

# Or paste supabase/schema.sql into the Supabase SQL Editor

4. Run

Docker (recommended):

docker-compose up --build

Manual:

# Terminal 1: Frontend
npm run dev

# Terminal 2: Backend
cd backend && python main.py

# Terminal 3: AI Desktop VM (optional)
docker-compose -f docker-compose.ai-desktop.yml up --build

Open http://localhost:3000, sign in, start a chat, and give your agent a task.




Tech Stack

Layer           Technologies
Frontend        Next.js 15, React 19, TypeScript, Tailwind CSS, Radix UI, Zustand, Vercel AI SDK
Backend         FastAPI, Python 3.10+, WebSockets, asyncio, uvicorn
AI Providers    OpenAI, Anthropic, Google, Azure, xAI, Mistral, Perplexity, OpenRouter
Infrastructure  Docker, Ubuntu 22.04 + XFCE, Chrome, Selenium, Supabase, Stripe
Desktop App     Electron 40, Puppeteer-core, platform-native automation (Win32 / CoreGraphics / xdotool)



Electron Desktop App

A lightweight overlay that runs AI agent commands directly on your local machine instead of a remote VM.

  • Floating always-on-top pill UI with expanded chat panel
  • Platform-native automation (PowerShell/Win32 on Windows, CoreGraphics/osascript on macOS, xdotool on Linux)
  • Browser control via Puppeteer-core, shell execution, file operations
  • WebSocket bridge to backend with auto-reconnect

cd electron
npm install
npm run dev        # Development with hot reload
npm run package    # Build for current platform
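
The feature list above mentions a WebSocket bridge with auto-reconnect. The README does not say how reconnection is timed, so the sketch below only illustrates one common approach, exponential backoff with a cap. The Electron app itself is written in TypeScript; this Python generator, with the hypothetical name `backoff_delays`, shows the timing logic alone.

```python
import itertools
import random
from typing import Iterator


def backoff_delays(base: float = 1.0, cap: float = 30.0,
                   jitter: float = 0.0) -> Iterator[float]:
    """Yield reconnect delays in seconds: base, 2*base, 4*base, ... up to cap.

    A random amount between 0 and `jitter` is added to each delay so that
    many clients do not all retry at the same instant.
    """
    delay = base
    while True:
        yield delay + random.uniform(0, jitter)
        delay = min(cap, delay * 2)


# First five delays with a 10-second cap and no jitter.
first_five = list(itertools.islice(backoff_delays(base=1.0, cap=10.0), 5))
```

A client loop would sleep for the next delay after each failed connection attempt and create a fresh generator once a connection succeeds, so the next outage starts again from the base delay.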



Project Structure

├── app/                    # Next.js routes & pages
├── components/             # React components (UI, chat, prompts)
├── lib/                    # Stores, providers, services, utilities
├── backend/
│   └── app/
│       ├── api/routes/     # FastAPI endpoints
│       ├── services/       # Multi-agent executor, VM control, billing
│       ├── providers/      # AI provider integrations
│       └── core/           # Config, middleware, logging
├── electron/
│   └── src/
│       ├── main/           # App lifecycle, IPC, automation modules
│       ├── preload/        # Context bridge API
│       └── renderer/       # React UI, stores, components
├── docker/ai-desktop/      # Ubuntu VM container
└── supabase/               # Database schema



Contributing

  1. Fork the repo
  2. Create a branch: git checkout -b feature/your-feature
  3. Commit your changes
  4. Open a pull request

Bug reports and feature requests are welcome in Issues.




Roadmap

  • Multi-VM parallel orchestration
  • Visual workflow builder
  • Agent marketplace & templates
  • Windows / macOS VM support
  • Plugin system for custom tools
  • Collaborative sessions
  • Voice control & video understanding



Responsible Use

This platform gives AI agents significant autonomy. Use it to automate repetitive tasks, testing, research, and content creation, not to violate terms of service, spam, or scrape without permission. Always use isolated environments, respect robots.txt, and follow data protection laws.
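
As one concrete way to respect robots.txt, an automation script can consult the site's rules before fetching a URL. The sketch below uses Python's standard `urllib.robotparser` on an in-memory example; the user agent name `example-agent` and the rules themselves are made up for illustration, and this is not a check the platform is documented to perform on its own.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Made-up rules: everything is allowed except paths under /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = build_parser(robots_txt)
allowed = parser.can_fetch("example-agent", "https://example.com/docs/")
blocked = parser.can_fetch("example-agent", "https://example.com/private/data")
```

Against a live site, `set_url()` followed by `read()` on the parser fetches and parses the real file instead of the in-memory text.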




License

Apache License 2.0. Copyright (c) 2025 Open Computer Use Contributors



