ChrisPulman/CodexComputerRunMCPServer

Codex Computer Run MCP Server

Codex Computer Run MCP Server gives Codex and other MCP-capable agents direct control over a signed-in desktop session. It exposes focused tools for screenshots, mouse movement, clicks, scrolling, keyboard shortcuts, Unicode paste, cursor position, and visible window discovery, plus a bundled Codex Skill for safe desktop-use workflows.

It is implemented in C# on net10.0 using ModelContextProtocol 1.3.0. The current package and MCP manifest version is 1.1.0. The package targets plain net10.0 so it can be distributed as a .NET tool. Windows uses native Win32 APIs; Linux and macOS use best-effort command-backed adapters.

Quick Install

Click to install in your preferred environment:

  • VS Code: Install Codex Computer Run MCP
  • VS Code Insiders: Install Codex Computer Run MCP
  • Visual Studio: Install Codex Computer Run MCP

Note:

  • These install links are prepared for the intended NuGet package identity CP.CodexComputerRun.Mcp.Server.
  • If the latest package has not been published yet, use the manual source-build or published-executable configuration below.
  • Run the server from the signed-in desktop session you want to control. Windows desktop automation must be launched from Windows, not WSL.
  • Linux support expects xdotool for pointer and keyboard actions, xrandr as a display-geometry fallback, wmctrl or xdotool for window discovery, one of gnome-screenshot, grim, or ImageMagick import for screenshots, and one of wl-copy, xclip, or xsel for clipboard paste.
  • macOS support uses screencapture, pbcopy, and osascript; pointer actions require cliclick. Screen Recording and Accessibility permissions may be required by macOS.

What Codex Computer Run Helps With

Codex Computer Run gives an agent a minimal, fast desktop-control layer that lets it:

  • Observe the full desktop via PNG screenshots.
  • Point the cursor at absolute virtual-screen coordinates.
  • Click left, right, or middle mouse buttons where supported, including repeated clicks. The built-in macOS adapter supports left and right clicks.
  • Scroll the wheel at the current cursor position or supplied coordinates.
  • Press single keys and keyboard shortcuts such as ctrl+l or ctrl+shift+escape.
  • Paste Unicode text through the platform clipboard paste path.
  • Inspect cursor position and visible top-level windows.

The server is designed for Codex computer-use workflows where the MCP client controls the active desktop.

Platform Support

Windows remains the primary implementation. Linux and macOS support keeps the same MCP tool surface but depends on external desktop commands that must be available inside the active graphical session.

| Area | Current behavior |
| --- | --- |
| Version | 1.1.0 |
| Target framework | net10.0 |
| Windows | Native Win32 implementation with virtual-screen capture, SendInput, clipboard paste, cursor position, and visible top-level window enumeration |
| Linux | Command-backed adapter using xdotool for pointer and keyboard input, xrandr for display-geometry fallback, wmctrl or xdotool for windows, screenshot command fallbacks, and clipboard command fallbacks |
| macOS | Command-backed adapter using screencapture, pbcopy, osascript, and cliclick; macOS middle-click automation is not supported by the built-in adapter |
| Unsupported OS | Deterministic unsupported-platform errors instead of silent no-ops |
| Session requirement | Signed-in interactive desktop session |
| Transport | MCP stdio |

Do not run this server from WSL to control a Windows desktop. Building from WSL through Windows dotnet.exe can work, but the MCP server itself must be launched by a Windows MCP client or Windows PowerShell session.

Codex Protocol

When this server is active, agents should follow this operating protocol:

  1. Call screenshot first when visual context matters.
  2. Call cursor_position before reasoning about movement relative to the current pointer location.
  3. Use list_windows to identify visible applications before focusing or interacting with them.
  4. Use move_mouse, click, scroll, press_key, hotkey, and type_text only when the intended foreground application is known.
  5. Prefer type_text for text entry because it uses Unicode clipboard paste and is faster and more reliable than simulated per-character typing.
  6. Keep screenshots small in conversation by setting include_image to false when only dimensions, platform metadata, or a saved path are needed.
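The protocol above can be sketched as a sequence of MCP `tools/call` requests. This is a minimal illustration, not the server's own client code: the tool and parameter names come from this README, while the helper names (`tool_call`, `observe_then_click`) and the output path `after-click.png` are hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids for this sketch


def tool_call(name, arguments=None):
    """Build one MCP JSON-RPC `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def observe_then_click(x, y):
    """Observation-first sequence: look, locate, identify, then act."""
    return [
        tool_call("screenshot", {"include_image": True}),
        tool_call("cursor_position"),
        tool_call("list_windows", {"limit": 50}),
        tool_call("click", {"x": x, "y": y, "button": "left"}),
        # Hypothetical path: save the follow-up capture to disk and keep
        # the image out of the conversation.
        tool_call("screenshot", {"path": "after-click.png", "include_image": False}),
    ]


if __name__ == "__main__":
    for request in observe_then_click(400, 300):
        print(json.dumps(request))
```

A real client would send each request over the stdio transport and wait for its response before sending the next one.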

Codex Skill

The repository and NuGet package include a Codex Skill at skills/codex-computer-run. The skill teaches Codex the observation-first workflow, safety rules, and exact MCP tool names for this server.

When the packaged server starts, it tries to install the skill into the current Codex installation if CODEX_HOME is set or %USERPROFILE%\.codex already exists. Existing skill files are not overwritten during automatic install.

Manual install from a globally installed tool:

dotnet tool install --global CP.CodexComputerRun.Mcp.Server --version 1.*
codex-computer-run-mcp-server --install-codex-skill

Manual install from source:

dotnet run --project .\src\CodexComputerRunMCPServer\CodexComputerRunMCPServer.csproj -- --install-codex-skill

Set CODEX_HOME first if Codex uses a non-default location:

$env:CODEX_HOME = "C:\Users\you\.codex"
codex-computer-run-mcp-server --install-codex-skill

To refresh an existing installed copy with the packaged skill files, add --force.

Use the skill in Codex by asking for it explicitly, for example:

Use $codex-computer-run to list visible windows, take a screenshot, and confirm the active desktop state.

Available MCP Tools

screenshot

Captures the current desktop as PNG.

Parameters:

  • path (optional) - output PNG path. If omitted, the image is returned in memory and no temporary file is created.
  • include_image (default: true) - include PNG image data in the MCP tool result.

Response: The first content block is JSON metadata with message, path, mimeType, platform, left, top, width, and height. When include_image is true, a PNG image block is also returned.

When to use: Use before interacting with the desktop, after UI changes, or when the agent needs visual confirmation.
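A client might unpack the screenshot result along these lines. The sketch assumes the metadata arrives as a text content block and the PNG as a base64 image content block, which is how MCP tool results usually carry them; the function name and the sample values are illustrative, not captured server output.

```python
import base64
import json


def parse_screenshot_result(result):
    """Split a screenshot tool result into (metadata dict, PNG bytes or None)."""
    blocks = result["content"]
    # First content block: JSON metadata carried as text.
    metadata = json.loads(blocks[0]["text"])
    png = None
    for block in blocks[1:]:
        if block.get("type") == "image":
            png = base64.b64decode(block["data"])
            break
    return metadata, png


# Illustrative result shape only; the values are made up for the sketch.
sample = {
    "content": [
        {
            "type": "text",
            "text": json.dumps({
                "message": "ok", "path": None, "mimeType": "image/png",
                "platform": "windows", "left": 0, "top": 0,
                "width": 1920, "height": 1080,
            }),
        },
        {
            "type": "image",
            "mimeType": "image/png",
            "data": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
        },
    ]
}

metadata, png = parse_screenshot_result(sample)
```

When include_image is false, the loop finds no image block and the function returns None for the PNG.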


move_mouse

Moves the cursor to absolute desktop coordinates.

Parameters:

  • x - absolute X coordinate.
  • y - absolute Y coordinate.
  • delay (optional) - seconds to wait after the action.

When to use: Use before a click or hover-sensitive action.


click

Clicks at the current cursor position or at supplied absolute coordinates.

Parameters:

  • x (optional) - absolute X coordinate.
  • y (optional) - absolute Y coordinate.
  • button (default: left) - left, right, or middle; middle is not supported by the built-in macOS adapter.
  • clicks (default: 1) - number of clicks.
  • interval (default: 0.08) - seconds between repeated clicks.
  • delay (optional) - seconds to wait after the action.

When to use: Use for buttons, menus, tabs, context menus, and desktop UI selection.
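As a rough sketch, a client could assemble and sanity-check the click arguments before sending them. The defaults mirror the parameter list above. The `platform` argument, the `"macos"` label, and the rule that x and y must be supplied together are assumptions of this sketch rather than documented server behavior.

```python
def click_arguments(x=None, y=None, button="left", clicks=1,
                    interval=0.08, delay=None, platform=None):
    """Build the arguments dict for a `click` tool call."""
    if button not in ("left", "right", "middle"):
        raise ValueError(f"unknown button: {button!r}")
    if button == "middle" and platform == "macos":
        # The built-in macOS adapter does not support middle clicks.
        raise ValueError("middle click is not supported on macOS")
    if (x is None) != (y is None):
        # Sketch-level policy: require both coordinates or neither.
        raise ValueError("supply both x and y, or neither")
    if clicks < 1:
        raise ValueError("clicks must be at least 1")

    args = {"button": button, "clicks": clicks, "interval": interval}
    if x is not None:
        args.update(x=x, y=y)
    if delay is not None:
        args["delay"] = delay
    return args


# Double-click at absolute coordinates (10, 20).
double_click = click_arguments(10, 20, clicks=2)
```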


scroll

Scrolls the mouse wheel.

Parameters:

  • amount (default: -3) - wheel notches. Positive scrolls up, negative scrolls down.
  • x (optional) - absolute X coordinate to move to before scrolling.
  • y (optional) - absolute Y coordinate to move to before scrolling.
  • delay (optional) - seconds to wait after the action.

When to use: Use for lists, pages, combo boxes, and scrollable application panes.


press_key

Presses one keyboard key.

Parameters:

  • key - key name or single character, for example enter, tab, escape, f5, a, A, ?, or 1.
  • duration (default: 0.03) - seconds to hold the key.
  • delay (optional) - seconds to wait after the action.

When to use: Use for navigation keys, function keys, confirm/cancel actions, and single-character shortcuts.


hotkey

Presses a keyboard shortcut.

Parameters:

  • keys - shortcut text using +, comma, or space separators, for example ctrl+l, ctrl+shift+escape, or alt+tab.
  • delay (optional) - seconds to wait after the action.

When to use: Use for application shortcuts, browser address bar focus, task switching, command palettes, and system shortcuts.
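The separator rule for `keys` can be illustrated with a small splitter. This is a guess at equivalent behavior, not the server's parser: it treats any run of `+`, commas, or whitespace as one separator, so it cannot express a shortcut that includes the `+` key itself.

```python
import re


def split_hotkey(keys):
    """Split shortcut text on '+', ',' or whitespace into key names."""
    parts = [p for p in re.split(r"[+,\s]+", keys.strip()) if p]
    if not parts:
        raise ValueError("hotkey text contains no keys")
    return parts


examples = {text: split_hotkey(text)
            for text in ("ctrl+l", "ctrl, shift, escape", "alt tab")}
```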


type_text

Pastes Unicode text into the focused application using the platform clipboard paste path.

Parameters:

  • text - text to paste.
  • delay (optional) - seconds to wait after the action.

When to use: Use for text fields, editors, terminals, and any non-trivial text entry.


cursor_position

Returns the current desktop cursor position as JSON.

When to use: Use before or after mouse actions when the agent needs exact coordinates.


list_windows

Lists visible top-level desktop windows as JSON.

Parameters:

  • limit (default: 50) - maximum number of windows to return.

When to use: Use to identify visible applications and window titles before interacting with the desktop.

Performance And Integration Notes

  • Screenshot capture avoids temporary files when path is omitted.
  • Setting include_image to false skips PNG encoding unless a path is supplied.
  • Windows mouse and keyboard actions use batched SendInput calls instead of legacy per-event APIs.
  • Windows hotkey presses all keys down and releases them in reverse order in one batch.
  • Windows clipboard access retries briefly when another process has the clipboard open.
  • Windows visible window enumeration caches process names by PID during each call.
  • Startup enables per-monitor DPI awareness on Windows for correct coordinate and screenshot behavior on mixed-DPI displays.
  • Linux and macOS adapters fail with actionable dependency messages when required desktop commands are missing.
  • Release publishing enables single-file and ReadyToRun output for faster Codex startup.
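The hotkey ordering described in the list above (all keys down, then released in reverse order) can be sketched as a flat event list. This only models the ordering; the function name is hypothetical, and the real Windows implementation submits the equivalent events through SendInput.

```python
def hotkey_events(keys):
    """Return (key, action) pairs: press in order, release in reverse."""
    downs = [(key, "down") for key in keys]
    ups = [(key, "up") for key in reversed(keys)]
    return downs + ups


events = hotkey_events(["ctrl", "shift", "escape"])
```

Releasing in reverse keeps the modifiers held until the final key has been released, which matches how a person presses a shortcut.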

Solution Layout

CodexComputerRunMCPServer.slnx          # Root solution wrapper for CI and local pack commands

src/
|-- CodexComputerRunMCPServer/          # MCP host, tools, service layer, and platform adapters
|-- CodexComputerRunMCPServer.Tests/    # TUnit unit and MCP integration tests
`-- CodexComputerRunMCPServer.slnx      # Source solution file

.mcp/
|-- server.json                         # MCP registry/package metadata
`-- install.md                          # Manual MCP install snippets

skills/
`-- codex-computer-run/                 # Codex Skill bundled into the NuGet package

Configuration

Lifecycle Safeguards

The server allows multiple Codex sessions to start their own MCP server process so tool discovery remains available in each session. Input-changing tools still coordinate desktop control by taking an exclusive, renewable lease under the platform local application-data folder:

CodexComputerRunMCPServer\control.lock

On Windows this is normally %LOCALAPPDATA%\CodexComputerRunMCPServer\control.lock. On Linux and macOS it follows .NET's local application-data location for the signed-in user, falling back to the temp directory if no local application-data path is available.

The control lease is acquired by move_mouse, click, scroll, press_key, hotkey, and type_text. If another Codex session currently owns the lease, the tool call fails with a busy message instead of allowing simultaneous mouse, keyboard, or clipboard input. Observation tools (screenshot, cursor_position, and list_windows) remain available from every session.

After the latest control action, the owning process keeps the lease briefly so follow-up clicks or keystrokes from the same session are not interleaved with another session. The lease is also released immediately when the owning MCP process exits.
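The lease semantics can be modeled in a few lines. This is an in-memory sketch of the described behavior only: the actual server coordinates separate processes through the control.lock file, and the class name, method names, and session ids here are hypothetical.

```python
import time


class ControlLease:
    """Exclusive, renewable lease held by one session at a time."""

    def __init__(self, lease_seconds=60.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock  # injectable so the sketch can be tested
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, session_id):
        """Take or renew the lease; return False if another session holds it."""
        now = self.clock()
        held_by_other = (
            self.owner not in (None, session_id) and now < self.expires_at
        )
        if held_by_other:
            return False  # the caller should report a busy message
        self.owner = session_id
        self.expires_at = now + self.lease_seconds
        return True

    def release(self, session_id):
        """Drop the lease immediately, for example when the owner exits."""
        if self.owner == session_id:
            self.owner = None
            self.expires_at = 0.0
```

Each input-changing call from the owner renews the expiry, so follow-up actions from the same session are not interleaved with another session until the lease lapses or is released.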

Idle shutdown is disabled by default so long-lived Codex sessions can call the MCP tools later without finding a closed stdio transport. If you explicitly enable idle shutdown, every tool call updates activity state and active calls are never stopped mid-invocation.

Optional environment overrides:

| Variable | Default | Detail |
| --- | --- | --- |
| CODEX_COMPUTER_RUN_CONTROL_LOCK | true | Set false to disable cross-session desktop-control coordination. |
| CODEX_COMPUTER_RUN_CONTROL_LEASE_SECONDS | 60 | Seconds the owning session keeps desktop control after the latest input-changing action. Set 0 to release immediately after each action. |
| CODEX_COMPUTER_RUN_IDLE_SHUTDOWN | false | Set true to enable idle shutdown. |
| CODEX_COMPUTER_RUN_IDLE_TIMEOUT_SECONDS | 300 | Seconds without tool activity before shutdown when idle shutdown is enabled. Values 0 or lower disable idle shutdown. |
| CODEX_COMPUTER_RUN_IDLE_CHECK_INTERVAL_SECONDS | 10 | Seconds between idle checks. |
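To make the defaults concrete, the following sketch resolves these overrides from an environment mapping. The variable names and defaults are the ones listed above. The parsing details (case-insensitive `true`, falling back to the default on unparsable numbers) and every Python name are assumptions, since the server's own parsing is not shown here.

```python
import os
from dataclasses import dataclass

PREFIX = "CODEX_COMPUTER_RUN_"


@dataclass(frozen=True)
class LifecycleSettings:
    control_lock: bool
    lease_seconds: float
    idle_shutdown: bool
    idle_timeout_seconds: float
    idle_check_interval_seconds: float


def _flag(env, name, default):
    raw = env.get(PREFIX + name)
    if raw is None or not raw.strip():
        return default
    return raw.strip().lower() == "true"


def _number(env, name, default):
    raw = env.get(PREFIX + name)
    try:
        return float(raw) if raw is not None else default
    except ValueError:
        return default  # assumption: ignore values that do not parse


def load_settings(env=os.environ):
    timeout = _number(env, "IDLE_TIMEOUT_SECONDS", 300.0)
    return LifecycleSettings(
        control_lock=_flag(env, "CONTROL_LOCK", True),
        lease_seconds=max(0.0, _number(env, "CONTROL_LEASE_SECONDS", 60.0)),
        # A timeout of 0 or lower disables idle shutdown even when enabled.
        idle_shutdown=_flag(env, "IDLE_SHUTDOWN", False) and timeout > 0,
        idle_timeout_seconds=timeout,
        idle_check_interval_seconds=_number(env, "IDLE_CHECK_INTERVAL_SECONDS", 10.0),
    )


defaults = load_settings({})
```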

Fast Codex Desktop Configuration

After publishing, Codex can launch the optimized executable directly. Use the runtime identifier that matches the OS running the signed-in desktop session.

Windows:

[mcp_servers.codex-computer-run]
command = "PathTo\\CodexComputerRunMCPServer\\artifacts\\publish\\win-x64\\CodexComputerRunMCPServer.exe"
args = []

Linux or macOS:

[mcp_servers.codex-computer-run]
command = "/path/to/CodexComputerRunMCPServer/artifacts/publish/linux-x64/CodexComputerRunMCPServer"
args = []

The checked-in .codex/config.toml uses the Windows fast published-executable path for this workspace.

Manual MCP Client Configuration

Published executable:

Windows:

{
  "mcpServers": {
    "codex-computer-run": {
      "command": "PathTo\\CodexComputerRunMCPServer\\artifacts\\publish\\win-x64\\CodexComputerRunMCPServer.exe",
      "args": []
    }
  }
}

Linux or macOS:

{
  "mcpServers": {
    "codex-computer-run": {
      "command": "/path/to/CodexComputerRunMCPServer/artifacts/publish/linux-x64/CodexComputerRunMCPServer",
      "args": []
    }
  }
}

NuGet package through dnx:

{
  "mcpServers": {
    "codex-computer-run": {
      "command": "dnx",
      "args": [
        "CP.CodexComputerRun.Mcp.Server@1.*",
        "--yes"
      ]
    }
  }
}

Development source run:

{
  "mcpServers": {
    "codex-computer-run": {
      "command": "dotnet",
      "args": [
        "run",
        "--project",
        "PathTo\\CodexComputerRunMCPServer\\src\\CodexComputerRunMCPServer\\CodexComputerRunMCPServer.csproj",
        "--configuration",
        "Release",
        "--no-launch-profile"
      ]
    }
  }
}

Use forward slashes in the project path on Linux and macOS.

Are mcp-config.development.windows.json And mcp-config.windows.json Required?

No. They are optional convenience snippets for MCP clients that import JSON config files manually.

The primary MCP and Codex files are:

  • .mcp/server.json for MCP package metadata.
  • .mcp/install.md for install notes.
  • skills/codex-computer-run for the bundled Codex Skill.
  • .codex/config.toml for this local Codex workspace.
  • .mcp.json only if your client reads repository-local MCP JSON configuration.

Build

Windows PowerShell:

dotnet restore .\CodexComputerRunMCPServer.slnx
dotnet build .\CodexComputerRunMCPServer.slnx --configuration Release

Linux or macOS:

dotnet restore ./CodexComputerRunMCPServer.slnx
dotnet build ./CodexComputerRunMCPServer.slnx --configuration Release

If a running MCP server locks the default bin\Release output, build to a verification output path:

dotnet build .\CodexComputerRunMCPServer.slnx --configuration Release --no-restore /p:OutputPath=D:\Projects\Github\chrispulman\CodexComputerRunMCPServer\artifacts\verify\bin\

Test

Windows PowerShell:

dotnet test .\src\CodexComputerRunMCPServer.Tests\CodexComputerRunMCPServer.Tests.csproj --configuration Release

Linux or macOS:

dotnet test ./src/CodexComputerRunMCPServer.Tests/CodexComputerRunMCPServer.Tests.csproj --configuration Release

Coverage with TUnit/Microsoft Testing Platform:

dotnet test .\src\CodexComputerRunMCPServer.Tests\CodexComputerRunMCPServer.Tests.csproj --configuration Release -- --coverage --coverage-output coverage.cobertura.xml --coverage-output-format cobertura --results-directory .\artifacts\test-results

Current verification:

  • 61 TUnit tests passed.
  • Coverage: 77.65% line coverage, 48.37% branch coverage for testable code.
  • Repository and package verification confirm skills/codex-computer-run/SKILL.md and skills/codex-computer-run/agents/openai.yaml are bundled.
  • Native Win32 P/Invoke shims are excluded from coverage and verified through the service boundary plus live MCP tool discovery.

Publish

The helper script name is historical; it now accepts Windows, Linux, and macOS runtime identifiers.

.\scripts\publish-windows.ps1 -Runtime win-x64
.\scripts\publish-windows.ps1 -Runtime linux-x64
.\scripts\publish-windows.ps1 -Runtime osx-arm64

Direct command:

dotnet publish .\src\CodexComputerRunMCPServer\CodexComputerRunMCPServer.csproj --configuration Release --runtime win-x64 --self-contained false --output .\artifacts\publish\win-x64

MCP Verification

The TUnit suite verifies MCP metadata, the bundled Codex Skill, platform adapters, lifecycle behavior, and the static tool facade. The published win-x64 executable was also validated with an MCP stdio initialize and tools/list handshake. The server reported all 9 tools:

scroll, hotkey, type_text, screenshot, list_windows, click, move_mouse, press_key, cursor_position
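A similar check can be scripted. The sketch below builds the initialize and tools/list messages and compares a tools/list response against the nine expected names. The client name and the protocol version string are placeholders chosen for the sketch, and it does not launch the server; a real check would write these lines to the executable's stdin and read the responses from its stdout.

```python
import json

EXPECTED_TOOLS = {
    "scroll", "hotkey", "type_text", "screenshot", "list_windows",
    "click", "move_mouse", "press_key", "cursor_position",
}


def handshake_messages(protocol_version="2025-06-18"):
    """JSON-RPC messages for an MCP initialize + tools/list handshake."""
    return [
        {"jsonrpc": "2.0", "id": 1, "method": "initialize",
         "params": {
             "protocolVersion": protocol_version,  # placeholder version
             "capabilities": {},
             "clientInfo": {"name": "readme-sketch", "version": "0.0.0"},
         }},
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]


def encode_for_stdio(messages):
    """The stdio transport sends one JSON message per line."""
    return "".join(json.dumps(m) + "\n" for m in messages)


def missing_tools(tools_list_response):
    """Return expected tool names absent from a tools/list response."""
    names = {tool["name"] for tool in tools_list_response["result"]["tools"]}
    return EXPECTED_TOOLS - names
```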

Live Linux and macOS desktop behavior depends on the active graphical session, installed command dependencies, and OS-level permissions.

Example Prompts For Your AI Assistant

Once configured, you can ask things like:

  • "Call screenshot and describe the active window."
  • "List visible windows and tell me which browser tabs or apps are available."
  • "Move the mouse to x=400, y=300, click, then take another screenshot."
  • "Press ctrl+l, type https://example.com, then press enter."
  • "Paste this text into the focused editor using type_text."
  • "Scroll down 5 notches and confirm what changed on screen."
  • "Get the cursor position before clicking."

Safety Notes

This server controls the active desktop. Mouse, keyboard, and clipboard actions affect the currently focused application. Use it only in a trusted desktop session and pair destructive UI actions with screenshots or window checks first.

About

An MCP Server to enable execution of Windows applications from Codex
