Skip to content

Support end-to-end reload workflow + codex configurator#2

Merged
dsarno merged 9 commits into
mainfrom
codex-reload-end-to-end
Apr 13, 2026
Merged

Support end-to-end reload workflow + codex configurator#2
dsarno merged 9 commits into
mainfrom
codex-reload-end-to-end

Conversation

@dsarno

@dsarno dsarno commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR wires the reload workflow all the way through for local development:

  • adds a shared runtime/handler layer for editor, scene, node, project, client, session, and testing flows
  • makes reload_plugin robust against fast reconnect races by detecting the replacement session directly
  • adds a real reloadable ASGI entrypoint for HTTP transports via godot_ai.asgi
  • switches python -m godot_ai --transport streamable-http --port 8000 --reload onto Uvicorn's supported import-string/factory reload path
  • adds first-class Codex configurator support in client_configurator.gd, including configure/status/remove handling for ~/.codex/config.toml
  • updates the Godot plugin dock and dev-server workflow so plugin reload, server reload, and client configuration can be exercised together
  • adds regression coverage for runtime handlers, CLI reload wiring, session waiting, client flows, and MCP tool reload behavior

Root Cause

There were two separate failure modes behind the expected reload behavior:

  1. Python dev reload was effectively a no-op. We were passing reload=True through FastMCP's in-memory app path, but Uvicorn only activates its reload supervisor when the app is provided as an import string/factory.
  2. The MCP reload_plugin wrapper could miss a very fast reconnect. If the replacement Godot session registered before the waiter was installed, the tool would hang until timeout even though the plugin had already reloaded.

This PR also folds in the related Codex client configuration work that was needed to exercise the end-to-end workflow from the editor side.

Codex Configurator

  • adds codex to the supported client set alongside Claude Code and Antigravity
  • manages the godot-ai MCP server entry in ~/.codex/config.toml
  • supports configure, status, and remove flows for Codex
  • routes the Python-side client tools through the new runtime/handler layer instead of calling the lifespan context directly

Impact

  • Python edits now trigger a real server restart in dev --reload mode.
  • Native reload_plugin now returns reliably after the plugin reconnects.
  • Codex can be configured from the plugin-side client configurator flow as part of the same workflow.
  • The plugin-side dev workflow and client configurator flow can be reviewed together instead of requiring manual restarts or hand-waving around timeouts.

Validation

Automated:

  • pytest -q tests/unit/test_cli_reload.py tests/unit/test_runtime_handlers.py tests/unit/test_session_registry.py tests/unit/test_resources.py tests/integration/test_mcp_tools.py

Live verification:

  • started the external MCP server in --reload mode and confirmed Uvicorn launched a real WatchFiles reloader
  • made a temporary Python marker change and confirmed WatchFiles restarted the worker and native editor_state reflected the new server-side marker
  • made a temporary plugin marker change and confirmed native reload_plugin returned a new session id and logs_read showed the updated plugin-side marker
  • reverted both temporary markers and confirmed the live editor returned to the normal state

@dsarno dsarno changed the title [codex] Support end-to-end reload workflow Support end-to-end reload workflow + codex configurator Apr 13, 2026
@codecov

codecov Bot commented Apr 13, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.28483% with 12 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/godot_ai/resources/editor.py 60.00% 4 Missing ⚠️
src/godot_ai/resources/scenes.py 60.00% 4 Missing ⚠️
src/godot_ai/sessions/registry.py 90.90% 2 Missing ⚠️
src/godot_ai/handlers/editor.py 97.56% 1 Missing ⚠️
src/godot_ai/runtime/direct.py 97.05% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

- Replace polling loop in reload_plugin with event-driven wait_for_session
  (race-safe pre-check + Future-based wait instead of 50ms sleep loop)
- Clean up session waiter futures on timeout in SessionRegistry
- Remove [reload-smoke] debug tag from plugin log
- Remove TOCTOU dir existence check in Codex configurator
- Add handler unit tests for all domains (testing, client, session, scene,
  node, editor, project) — closes coverage from 85% to 97%
- Add integration tests for client, session, testing tools and resource reads
- Add ASGI error path tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes an end-to-end reload workflow for local development by introducing a shared runtime/handler layer, adding a reloadable ASGI entrypoint for HTTP transports (to enable real Uvicorn --reload), wiring robust reload_plugin behavior, and expanding the Godot plugin’s client configurator to support Codex.

Changes:

  • Introduces shared handlers/* plus runtime/* to unify tool/resource logic across editor/scene/node/project/session/testing flows.
  • Adds godot_ai.asgi and updates CLI --reload behavior to use Uvicorn’s factory/import-string reload path for HTTP transports.
  • Adds plugin-side reload support + dev-server controls, and adds Codex client configuration support (~/.codex/config.toml) with new regression tests.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
tests/unit/test_session_registry.py Adds async coverage for SessionRegistry.wait_for_session() behavior.
tests/unit/test_runtime_handlers.py Adds unit tests for shared handlers + DirectRuntime adapter (including reload behavior).
tests/unit/test_resources.py Makes resource registration test compatible with multiple FastMCP APIs (list_resources vs get_resources).
tests/unit/test_cli_reload.py Verifies ASGI factory + CLI reload wiring uses Uvicorn factory path and reload dirs.
tests/integration/test_mcp_tools.py Adds integration coverage for reload_plugin reconnect cycle; minor formatting updates.
src/godot_ai/tools/testing.py Switches tool implementations to shared testing handlers via DirectRuntime.
src/godot_ai/tools/session.py Switches session tools to shared session handlers via DirectRuntime.
src/godot_ai/tools/scene.py Switches scene tools to shared scene handlers via DirectRuntime.
src/godot_ai/tools/project.py Switches project tools to shared project handlers via DirectRuntime.
src/godot_ai/tools/node.py Switches node tools to shared node handlers via DirectRuntime.
src/godot_ai/tools/editor.py Switches editor tools to shared editor handlers and adds reload_plugin tool.
src/godot_ai/tools/client.py Switches client tools to shared client handlers; updates docs to include Codex.
src/godot_ai/sessions/registry.py Adds session waiters + wait_for_session() for reconnect/reload workflows.
src/godot_ai/runtime/interface.py Defines a Runtime protocol consumed by shared handlers.
src/godot_ai/runtime/direct.py Adds in-process DirectRuntime adapter for FastMCP lifespan context.
src/godot_ai/runtime/init.py Introduces runtime package.
src/godot_ai/resources/sessions.py Routes sessions resource through shared session handlers/runtime adapter.
src/godot_ai/resources/scenes.py Routes scene resources through shared scene handlers/runtime adapter.
src/godot_ai/resources/project.py Routes project resources through shared project handlers/runtime adapter.
src/godot_ai/resources/editor.py Routes editor resources through shared editor handlers/runtime adapter.
src/godot_ai/handlers/testing.py Adds shared handler implementations for testing commands.
src/godot_ai/handlers/session.py Adds shared handler implementations for session commands/resources.
src/godot_ai/handlers/scene.py Adds shared handler implementations for scene commands/resources.
src/godot_ai/handlers/project.py Adds shared handler implementations for project commands/resources.
src/godot_ai/handlers/node.py Adds shared handler implementations for node commands.
src/godot_ai/handlers/editor.py Adds shared handler implementations for editor commands/resources incl. reload workflow.
src/godot_ai/handlers/client.py Adds shared handler implementations for client configuration/status commands.
src/godot_ai/handlers/init.py Introduces handlers package.
src/godot_ai/asgi.py Adds reloadable ASGI app factory + run_with_reload() Uvicorn runner.
src/godot_ai/init.py Updates CLI entrypoint to support reload runner and injectable argv for testing.
plugin/addons/godot_ai/plugin.gd Registers plugin reload command; adds dev-server start/stop helpers and passes plugin into dock.
plugin/addons/godot_ai/mcp_dock.gd Adds dev-server start/stop UI button and wires it to plugin methods.
plugin/addons/godot_ai/handlers/editor_handler.gd Adds plugin-side reload_plugin command handler.
plugin/addons/godot_ai/client_configurator.gd Adds Codex client support with configure/status/remove flows via ~/.codex/config.toml.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread plugin/addons/godot_ai/plugin.gd Outdated
if _plugin.is_dev_server_running():
_plugin.stop_dev_server()
else:
_plugin.start_dev_server()

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dock calls _plugin.start_dev_server() directly, but start_dev_server() contains an await and therefore returns a coroutine/function state unless awaited. This likely prevents the dev server from actually starting. Either await the call here (and make this handler async) or refactor start_dev_server() to be non-yielding.

Suggested change
_plugin.start_dev_server()
await _plugin.start_dev_server()

Copilot uses AI. Check for mistakes.
Comment on lines +52 to 56
_dock.setup(_connection, _log_buffer, self)
add_control_to_dock(DOCK_SLOT_RIGHT_BL, _dock)

_log_buffer.log("plugin loaded")

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message "plugin loaded [reload-smoke]" looks like a temporary smoke-test marker, but it’s left in the committed changes and will permanently alter user-facing logs. If this was only for manual validation, it should be reverted back to the normal message before merge (also aligns with the PR description mentioning temporary markers were reverted).

Copilot uses AI. Check for mistakes.
Comment on lines +251 to +253
static func _get_codex_config_path() -> String:
return OS.get_environment("HOME").path_join(".codex/config.toml")

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_codex_config_path() uses OS.get_environment("HOME") unconditionally. On Windows this env var is often unset, which would produce a relative path like .codex/config.toml and read/write the wrong location. Consider using a Windows fallback (e.g., USERPROFILE) and/or validating that the resolved home directory is non-empty before writing.

Copilot uses AI. Check for mistakes.
Comment thread src/godot_ai/sessions/registry.py Outdated
Comment on lines +26 to +27
def from_context(cls, ctx: Context) -> DirectRuntime:
app = ctx.fastmcp._lifespan_result

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DirectRuntime.from_context() reaches into ctx.fastmcp._lifespan_result, which is a private attribute (leading underscore) and can break with FastMCP upgrades. Prefer an official Context API if available (e.g. ctx.lifespan_context) or add a compatibility fallback (try public attributes first, then private) to avoid a hard runtime failure for all tools/resources.

Suggested change
def from_context(cls, ctx: Context) -> DirectRuntime:
app = ctx.fastmcp._lifespan_result
def _get_app_context(cls, ctx: Context) -> SupportsDirectRuntime | None:
app = getattr(ctx, "lifespan_context", None)
if app is not None:
return app
fastmcp = getattr(ctx, "fastmcp", None)
if fastmcp is None:
return None
app = getattr(fastmcp, "lifespan_context", None)
if app is not None:
return app
return getattr(fastmcp, "_lifespan_result", None)
@classmethod
def from_context(cls, ctx: Context) -> DirectRuntime:
app = cls._get_app_context(ctx)

Copilot uses AI. Check for mistakes.
Comment on lines +92 to +100
class TestWaitForSession:
async def test_resolves_immediately_if_session_already_registered(self):
"""If a session registers before wait starts, it resolves on the next one."""
reg = SessionRegistry()
s = _make_session("new-1")

async def register_soon():
await asyncio.sleep(0.05)
reg.register(s)

Copilot AI Apr 13, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test docstring says the session registers "before wait starts", but the test schedules registration after a sleep while wait_for_session() is already pending. Consider updating the docstring to reflect the actual scenario being validated (race where the waiter is installed before registration).

Copilot uses AI. Check for mistakes.
Comment thread plugin/addons/godot_ai/plugin.gd Outdated
dsarno and others added 5 commits April 12, 2026 17:30
- Remove await from start_dev_server, use timer callback instead so the
  dock call site does not silently drop the coroutine
- Escape %a in Windows stop_dev_server cmd string to avoid GDScript
  format operator treating it as a specifier
- Use USERPROFILE fallback for Codex config path on Windows
- Switch get_event_loop() to get_running_loop() in wait_for_session and
  move waiter cleanup to finally block (covers cancellation too)
- Fix misleading test docstring

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Runs after the GDScript test suite on all three OSes. The script
calls reload_plugin via MCP HTTP, waits for the new session, then
calls editor_state to verify the plugin is alive post-reload. This
exercises the real EditorInterface.set_plugin_enabled path that
can't be tested from within the plugin's own test runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nuity

- Before reload: create a MeshInstance3D (ReloadTestCube) via old plugin
- After reload: check logs_read starts with 'plugin loaded' (proves
  handlers/log buffer were fully rebuilt, not reused)
- After reload: find ReloadTestCube via new plugin (proves the scene
  tree survived the plugin disable/enable cycle)
- Extract mcp_call helper to reduce SSE parsing repetition

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ping

The mcp_call helper was swallowing error output into the captured
variable, then set -e killed the script before it could be printed.
Now errors go to stderr and mcp_call_or_die prints diagnostics.
Also handles isError tool responses and fixes default args brace issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…st counts

CLAUDE.md:
- Add streamable-http transport to architecture
- Document handler/runtime abstraction pattern
- Update 'adding a new tool' steps for handler layer
- Add Codex to client configuration list
- Document reload_plugin tool and ASGI reload path
- Update test count: 81 -> 140

implementation-plan.md:
- Mark CI setup tasks as done (Tier 1+2, Codecov, badges)
- Add reload workflow items as completed
- Add Codex client and handler/runtime layer as completed
- Update test count: 125 -> 184
- Add project_handler.gd to plugin file listing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@dsarno dsarno marked this pull request as ready for review April 13, 2026 19:03
@dsarno dsarno merged commit 17d2544 into main Apr 13, 2026
10 checks passed
@dsarno dsarno deleted the codex-reload-end-to-end branch April 13, 2026 19:03
dsarno pushed a commit that referenced this pull request May 5, 2026
…uplicate-ID reject

Three small audit-v2 fixes in src/godot_ai/transport/websocket.py
covering #343 sub-findings #2, #4, and #5.

- #4 (P1): replace hardcoded `e.errno == 48` (macOS-only) with
  `errno.EADDRINUSE` so Linux (98) and Windows (10048) also hit
  the friendly "port already in use" branch instead of crashing
  the WS lifespan with an unhandled OSError.

- #5 (P2): wrap `send_command`'s `ws.send` + `wait_for` in a
  try/finally that always pops `_pending`. Pre-fix, a `ws.send`
  raise (ConnectionClosed mid-send, transport error) leaked the
  Future entry forever; under churn the dict grew unbounded.
  Happy path still pops via the response receiver, so the finally
  pop is a no-op there.

- #2 (P1): reject a second handshake whose session_id is already
  registered (close code 4001), instead of silently overwriting
  the routing map. session_id format is `<slug>@<4hex>` — 16 bits
  of suffix is locally guessable, so without this any local peer
  could hijack an active session by impersonating its ID.
  Legitimate plugin reconnect after `editor_reload_plugin` first
  triggers ConnectionClosed -> unregister, so the new connect
  still lands cleanly.

Tests:
- tests/unit/test_websocket_server_lifecycle.py — parametrized
  errno coverage (macos 48, linux 98, windows 10048; one runs per
  CI host) plus a non-EADDRINUSE OSError propagation case.
- tests/integration/test_websocket.py — TestDuplicateHandshake
  pins the reject + reconnect-after-clean-disconnect paths;
  TestPendingFutureCleanup pins the timeout-pop and
  send-failure-pop behaviors.

Live-smoked against a real Godot 4.6 editor: full GDScript suite
1037/1040 passed, duplicate-handshake rejected with close 4001,
original session unaffected.

https://claude.ai/code/session_016ijmCD5S6QfwJGJcc5Wirp
dsarno pushed a commit that referenced this pull request May 5, 2026
…ests

Per #316 acceptance criterion #2: after 12+ fix attempts the Camera2D
`make_current` → `Viewport.camera_2d` slot propagation still occasionally
lags on macOS headless. The handler-side logical-current bookkeeping (#311)
stays correct in that race; only the engine-side viewport slot lags. The
test polls 200 ms (#341 widened diagnostics) but observed lag has been up
to ~600 ms in the wild — beyond any reasonable wait budget for a unit test.

Add `_skip_if_macos_engine_lag(cam, expected, report, label)`:
- if the wait already settled → noop, caller continues
- if not on macOS headless → noop, caller's hard assertion fails normally
- if handler logical state disagrees with expectation → noop (real regression,
  caller's hard assertion fires with the diagnostic)
- only when on macOS headless AND handler state agrees → skip() with the full
  diagnostic message so the next investigator still sees the divergence

Wired into `test_create_with_make_current_unmarks_sibling` and the four poll
sites in `test_configure_current_sibling_unmark_single_undo`. Linux / Windows
behavior is unchanged — those platforms still hard-fail on a missed propagation.

This intentionally does NOT close the underlying race: it converts the known
upstream-flavor flake into a soft signal so CI stops reporting noise on it.
The diagnostic-rich skip message keeps the failure visible enough that any
NEW divergence (e.g. a real handler regression) still surfaces.

Refs: #316
dsarno added a commit that referenced this pull request May 5, 2026
…ests (#372)

* Skip persistent macOS-headless engine-state lag in #316 flake-prone tests

Per #316 acceptance criterion #2: after 12+ fix attempts the Camera2D
`make_current` → `Viewport.camera_2d` slot propagation still occasionally
lags on macOS headless. The handler-side logical-current bookkeeping (#311)
stays correct in that race; only the engine-side viewport slot lags. The
test polls 200 ms (#341 widened diagnostics) but observed lag has been up
to ~600 ms in the wild — beyond any reasonable wait budget for a unit test.

Add `_skip_if_macos_engine_lag(cam, expected, report, label)`:
- if the wait already settled → noop, caller continues
- if not on macOS headless → noop, caller's hard assertion fails normally
- if handler logical state disagrees with expectation → noop (real regression,
  caller's hard assertion fires with the diagnostic)
- only when on macOS headless AND handler state agrees → skip() with the full
  diagnostic message so the next investigator still sees the divergence

Wired into `test_create_with_make_current_unmarks_sibling` and the four poll
sites in `test_configure_current_sibling_unmark_single_undo`. Linux / Windows
behavior is unchanged — those platforms still hard-fail on a missed propagation.

This intentionally does NOT close the underlying race: it converts the known
upstream-flavor flake into a soft signal so CI stops reporting noise on it.
The diagnostic-rich skip message keeps the failure visible enough that any
NEW divergence (e.g. a real handler regression) still surfaces.

Refs: #316

* ci: trigger workflow after PR base change to beta

* Tighten #316 macOS gate to require explicit logical marker (PR review)

Copilot review on PR #372: `_handler_logical_current_matches` was using
`camera_get(path).current` which routes through `_resolve_current`, which
falls back to `_is_effective_current` (raw engine state) when no logical
marker is set. For `expected=false` that meant the gate could return "matches"
when the handler had lost track entirely — masking a real regression as a
documented engine-state lag.

In practice the case is narrow (`is_current()` and `viewport.get_camera_2d()
== cam` are equivalent absent a stale-viewport divergence, which #316
diagnostics ruled out via `cam_viewport_matches_scene=true`), but the gate
must not rely on that abstractly.

- `camera_handler.gd`: add `peek_logical_current(scene_root, type_str)` —
  public introspection that returns the logical-current camera (or null
  when no marker), bypassing the engine fallback.
- `test_camera.gd`: rewrite `_handler_logical_current_matches` to consult
  `peek_logical_current` directly. Skip only fires when:
  - expected=true: marker points at this camera
  - expected=false: marker exists for the type and points elsewhere
  No marker → refuse to skip in either direction → real regression fails.

Refs: #316

---------

Co-authored-by: Claude <noreply@anthropic.com>
dsarno added a commit that referenced this pull request May 5, 2026
…#375)

* Reject non-loopback Host/Origin on WS + HTTP transports (audit-v2 #1)

Closes #345 — refs #343 (audit-v2 umbrella).

Both transports bound to 127.0.0.1, but a malicious browser tab could
mount **DNS rebinding**: resolve attacker.example.com to 127.0.0.1,
then `new WebSocket("ws://attacker.example.com:9500")`. The request
lands on our loopback socket carrying a non-localhost Host (and
Origin), the WS handshake registered the peer as a session, and any
MCP tool became drivable from a foreign origin — including write
tools.

The streamable-HTTP transport on :8000 had the same gap, including
`/godot-ai/status` (comment-marked "small unauthenticated probe").

Per the issue's "Fix shape": **strict Host/Origin allowlist in a
Starlette middleware ahead of FastMCP** plus a matching `process_request`
hook on the WebSocket server.

`src/godot_ai/transport/origin_guard.py` (new):

- `is_allowed_host` — accepts `127.0.0.1`, `localhost`, `[::1]` with
  optional `:port`; case-insensitive; bare `::1` rejected (RFC 7230
  requires bracketed IPv6 in Host).
- `is_allowed_origin` — None / empty / `null` accepted (native clients
  omit Origin; sandboxed/file:// emit `null`); otherwise must parse to
  a URL whose hostname is loopback. Schemes outside `http/https/ws/wss`
  rejected.
- `LocalhostOnlyHTTPMiddleware` — ASGI middleware. Rejects on first
  failure with HTTP 403; passes lifespan scopes through; handles
  duplicate-Host smuggling shapes (fail closed when the same header
  appears more than once).
- `make_websocket_request_guard()` — `process_request` hook for
  `websockets.asyncio.server.serve(...)` mirroring the same logic.
  Uses `headers.get_all` so a smuggled duplicate Host fails closed
  rather than tripping `MultipleValuesError` and surfacing as 500.

Wiring:

- `transport/websocket.py`: pass `process_request=` to `serve()`.
- `server.py`: outermost wrap on the HTTP app, applied to every
  HTTP transport (`http`, `streamable-http`, `sse`) so `/godot-ai/status`
  and the FastMCP endpoints are guarded uniformly.

Native clients keep working: the Godot plugin's WebSocketPeer sends
`Host: 127.0.0.1:<port>` and no Origin. Verified by smoke against a
real WebSocketServer (loopback connect → `handshake_ack`; rebound
Origin → 403) and against the streamable-HTTP transport
(`/godot-ai/status`: loopback → 200, `Host: attacker.example.com` → 403,
loopback Host + browser Origin → 403).

Tests:

- `tests/unit/test_origin_guard.py` (44 tests):
  parametrized helper coverage (loopback / sneaky-substring / IP /
  malformed / multiple smuggled values), middleware behavior, lifespan
  passthrough, `state` `__getattr__` passthrough.
- `tests/integration/test_websocket.py::TestDnsRebindingGuard` (5
  tests): live `websockets` server + client. Loopback succeeds,
  non-loopback Host returns 403, non-loopback Origin returns 403,
  loopback-shaped Origin succeeds, rejected request never registers
  a session.
- `tests/unit/test_asgi_session_diagnostics.py`: structural assertions
  updated for the new outer wrap; FastMCP `state` lookup now traverses
  one extra layer.
- `tests/unit/test_server_status.py`: `TestClient(base_url=...)` set
  to a loopback host so the request passes the new guard.

Test plan:
- ruff check + format: clean
- pytest -q: 825 passed (+45 new)
- script/ci-check-gdscript: clean (no GDScript touched)
- Live smoke: real `GodotWebSocketServer` and real streamable-HTTP
  app, loopback succeeds, non-loopback Host/Origin → 403.

Note: PR #366 (audit-v2 #2, session-hijack) closes the
duplicate-handshake takeover. This PR closes the rebinding-from-browser
path. Together they harden the WebSocket transport at both layers
(who can connect, and what they can do once they have).

* Simplify: extract evaluate_loopback so WS + HTTP guards can't drift

Both the WebSocket ``process_request`` hook and ``LocalhostOnlyHTTPMiddleware``
implemented the same 3-step rule (count duplicate headers → check
``is_allowed_host`` → check ``is_allowed_origin`` → 403) twice, with
slightly different intermediate state. Funnel both through one
``evaluate_loopback(hosts, origins) -> bool`` so a regression in one
transport cannot accidentally diverge from the other.

Drive-by:

- Decode FORBIDDEN_BODY once at module load (FORBIDDEN_BODY_TEXT) so
  the WS guard doesn't decode the same bytes per rejection.
- Drop redundant ``parsed.scheme.lower()`` — ``urlsplit`` already
  normalises the scheme per RFC 3986.

No behavior change. 825 tests pass.

* Address Copilot review: reject Origin: null + Sec-Fetch-Site cross-origin

Two browser-side liveness/bypass shapes that the first cut of the
loopback guard let through:

1. **``Origin: null``** (Copilot, origin_guard.py:85). Sandboxed
   ``<iframe sandbox>`` and downloaded ``file://`` pages serialize
   their origin as ``null``. The original guard accepted this on the
   theory that file:// /sandboxed contexts are "legitimate"; in
   practice they're exactly the bypass an attacker would use to bridge
   a foreign origin onto our loopback socket. **Now rejected.** Native
   clients never produce ``null`` (they omit Origin entirely), so this
   tightening is invisible to the Godot plugin / FastMCP CLI / curl.

2. **Cross-origin no-cors subresource probes** (Copilot, server.py:79).
   `<img src="http://127.0.0.1:9500/godot-ai/status">` from
   ``https://attacker.example.com`` arrives with a loopback ``Host``
   and *no* ``Origin`` (browsers omit Origin for ``no-cors`` GETs of
   ``<img>`` / ``<script>`` / ``<link>``), so the original Host/Origin
   gate let it through and the page could use the 200/403 outcome as
   a liveness oracle. Browsers stamp every HTTP request with
   ``Sec-Fetch-Site`` (``cross-site`` / ``same-site`` / ``same-origin``
   / ``none``). Native non-browser clients never send it. **The guard
   now rejects ``cross-site`` and ``same-site``** while still allowing
   ``none`` (top-level navigation: user typed URL, bookmark) and
   ``same-origin`` (loopback page fetching its own server). Missing →
   allow (native client), preserving the existing flow.

Drive-by: trailing-dot loopback (``Host: localhost.``) now accepted.
RFC 1034 valid FQDN syntax that browsers and curl can preserve through
to the Host header — friction trap if rejected, no security
implication either way.

`evaluate_loopback` now takes an optional ``sec_fetch_sites`` list so
both transports run the new policy through the same single helper —
preserves the audit-v2 invariant that the WS hook and HTTP middleware
cannot drift.

Tests:

- New parametrized helper coverage for ``is_allowed_sec_fetch_site``
  (friendly + foreign + case-insensitive + whitespace), trailing-dot
  hosts, and explicit ``Origin: null`` rejection.
- Two new ``evaluate_loopback`` cases pinning the cross-site rejection
  with no-Origin (the exact Copilot shape) and ``Origin: null`` with
  any Sec-Fetch-Site value.
- Three new ``LocalhostOnlyHTTPMiddleware`` middleware cases:
  ``Origin: null`` rejected, cross-site subresource rejected, top-level
  navigation accepted.
- Three new live-WS integration cases pinning the same rules end-to-end
  through the websockets ``process_request`` hook, plus a
  bracketed-IPv6 ``http://[::1]:9500`` symmetry test for the WS path.

Live smoke: native loopback connect → ``handshake_ack`` ✓; ``Origin: null``
WS connect → 403 ✓; ``Sec-Fetch-Site: cross-site`` WS connect → 403 ✓;
``Origin: null`` HTTP probe → 403 ✓; cross-site HTTP probe → 403 ✓.

---------

Co-authored-by: Claude <noreply@anthropic.com>
dsarno pushed a commit that referenced this pull request May 5, 2026
…ake reject

Two small audit-v2 fixes in src/godot_ai/transport/websocket.py
covering #343 sub-findings #2 and #5. (Sub-finding #4 — errno.EADDRINUSE
portability — landed separately via PR #373.)

- #5 (P2): wrap `send_command`'s `ws.send` + `wait_for` in a
  try/finally that always pops `_pending`. Pre-fix, a `ws.send`
  raise (ConnectionClosed mid-send, transport error) leaked the
  Future entry forever; under churn the dict grew unbounded.
  Happy path still pops via the response receiver, so the finally
  pop is a no-op there.

- #2 (P1): reject a second handshake whose session_id is already
  registered (close code 4001), instead of silently overwriting
  the routing map. session_id format is `<slug>@<4hex>` — 16 bits
  of suffix is locally guessable, so without this any local peer
  could hijack an active session by impersonating its ID.
  Legitimate plugin reconnect after `editor_reload_plugin` first
  triggers ConnectionClosed -> unregister, so the new connect
  still lands cleanly.

Tests:
- tests/integration/test_websocket.py — TestDuplicateHandshake
  pins the reject + reconnect-after-clean-disconnect paths;
  TestPendingFutureCleanup pins the timeout-pop and
  send-failure-pop behaviors.

Live-smoked against a real Godot 4.6 editor: full GDScript suite
1037/1040 passed, duplicate-handshake rejected with close 4001,
original session unaffected.

https://claude.ai/code/session_016ijmCD5S6QfwJGJcc5Wirp
dsarno pushed a commit that referenced this pull request May 5, 2026
…ake reject

Two small audit-v2 fixes in src/godot_ai/transport/websocket.py
covering #343 sub-findings #2 and #5. (Sub-finding #4 — errno.EADDRINUSE
portability — landed separately via PR #373.)

- #5 (P2): wrap `send_command`'s `ws.send` + `wait_for` in a
  try/finally that always pops `_pending`. Pre-fix, a `ws.send`
  raise (ConnectionClosed mid-send, transport error) leaked the
  Future entry forever; under churn the dict grew unbounded.
  Happy path still pops via the response receiver, so the finally
  pop is a no-op there.

- #2 (P1): reject a second handshake whose session_id is already
  registered (close code 4001), instead of silently overwriting
  the routing map. session_id format is `<slug>@<4hex>` — 16 bits
  of suffix is locally guessable, so without this any local peer
  could hijack an active session by impersonating its ID.
  Legitimate plugin reconnect after `editor_reload_plugin` first
  triggers ConnectionClosed -> unregister, so the new connect
  still lands cleanly.

Tests:
- tests/integration/test_websocket.py — TestDuplicateHandshake
  pins the reject + reconnect-after-clean-disconnect paths;
  TestPendingFutureCleanup pins the timeout-pop and
  send-failure-pop behaviors.

Live-smoked against a real Godot 4.6 editor: full GDScript suite
1037/1040 passed, duplicate-handshake rejected with close 4001,
original session unaffected.

https://claude.ai/code/session_016ijmCD5S6QfwJGJcc5Wirp
dsarno added a commit that referenced this pull request May 5, 2026
…ake reject (#374)

* Harden WebSocket transport: pending-Future leak + duplicate-ID handshake reject

Two small audit-v2 fixes in src/godot_ai/transport/websocket.py
covering #343 sub-findings #2 and #5. (Sub-finding #4 — errno.EADDRINUSE
portability — landed separately via PR #373.)

- #5 (P2): wrap `send_command`'s `ws.send` + `wait_for` in a
  try/finally that always pops `_pending`. Pre-fix, a `ws.send`
  raise (ConnectionClosed mid-send, transport error) leaked the
  Future entry forever; under churn the dict grew unbounded.
  Happy path still pops via the response receiver, so the finally
  pop is a no-op there.

- #2 (P1): reject a second handshake whose session_id is already
  registered (close code 4001), instead of silently overwriting
  the routing map. session_id format is `<slug>@<4hex>` — 16 bits
  of suffix is locally guessable, so without this any local peer
  could hijack an active session by impersonating its ID.
  Legitimate plugin reconnect after `editor_reload_plugin` first
  triggers ConnectionClosed -> unregister, so the new connect
  still lands cleanly.

Tests:
- tests/integration/test_websocket.py — TestDuplicateHandshake
  pins the reject + reconnect-after-clean-disconnect paths;
  TestPendingFutureCleanup pins the timeout-pop and
  send-failure-pop behaviors.

Live-smoked against a real Godot 4.6 editor: full GDScript suite
1037/1040 passed, duplicate-handshake rejected with close 4001,
original session unaffected.

https://claude.ai/code/session_016ijmCD5S6QfwJGJcc5Wirp

* Apply pending ruff format to test_mcp_tools and test_gdscript_no_adjacent_string_concat

Drive-by: `ruff format --check` was failing on these two test files on beta.
Reformatting them keeps the format-check CI step green for the audit-v2
transport-hardening PR. No behavioral change — purely whitespace /
line-wrapping deltas produced by `ruff format`.

https://claude.ai/code/session_016ijmCD5S6QfwJGJcc5Wirp

---------

Co-authored-by: Claude <noreply@anthropic.com>
dsarno pushed a commit that referenced this pull request May 5, 2026
…tree

Two issues from Copilot's review on PR #381 + the second CI failure:

(A) Cross-scan signal race [Copilot]: scan #1 watchdog'd, scan #2 then
    arms a fresh `filesystem_changed` listener. A delayed emission
    from scan #1 fires on whichever listener is currently connected
    to the shared signal — Godot can't tag emissions with their
    source scan — so scan #2's listener falsely settles before its
    actual filesystem scan completed, re-enabling the plugin against
    a potentially incomplete on-disk install. Generation-counter +
    Callable.bind doesn't help because a single emission still fires
    every connected listener (CONNECT_ONE_SHOT only auto-disconnects
    after firing).

(B) Second CI failure: `Timer.start()` requires the Timer to be
    inside a SceneTree. The previous fix removed `add_child(runner)`
    from tests but didn't parent the runner anywhere, so when
    `_arm_scan_watchdog` called `add_child(timer)` and `timer.start()`
    inside the runner, the timer wasn't in any tree.

Fix:

- Add a sticky `_scan_timed_out` flag set by `_on_scan_watchdog_timeout`.
- `_start_filesystem_scan` checks the flag at the top and bypasses the
  connect + `fs.scan()` path entirely, falling straight through to
  `call_deferred(deferred_step)`. With no listener armed, a delayed
  emission from the timed-out scan has nothing to satisfy. Godot's
  normal background scan catches up after the plugin re-enables.
- Tests now parent the runner via `(Engine.get_main_loop() as
  SceneTree).root.add_child(runner)` so `Timer.start()` works.
  Cleanup via a `_free_runner` helper (remove_child + free).
- New regression test `test_subsequent_scan_after_watchdog_bypasses_
  listener_arm` explicitly drives scan #1 → watchdog → scan #2 and
  asserts scan #2 does NOT set `_waiting_for_scan = true` (i.e. no
  listener armed for a delayed scan-#1 emission to satisfy).
- Existing watchdog test `test_watchdog_no_op_when_signal_already_settled`
  also asserts `_scan_timed_out` stays false on the late-Timer-after-
  successful-signal path — guards against the watchdog poisoning
  subsequent scans when no actual deadlock happened.
dsarno added a commit that referenced this pull request May 5, 2026
…ad (#381)

* [Audit v2 #353] 30s watchdog on filesystem_changed during update reload

Pre-fix, `_start_filesystem_scan` connected `filesystem_changed`
(CONNECT_ONE_SHOT) and called `fs.scan()`, then sat in
`_waiting_for_scan = true` forever if the signal never fired (slow
disk, NFS, AV holding the just-extracted addon files open). The
update completed mechanically — files on disk, plugin disabled —
but the dock stayed disabled because `_finish_scan_wait` was the
only path back to `_enable_new_plugin`. User had to kill and
restart Godot.

Add a 30-second `Timer` armed alongside the signal connect:

- `_arm_scan_watchdog`: lazy-creates a one-shot `Timer` child node
  on first use, reuses it on subsequent arms (the runner does two
  filesystem scans per update — new files then existing files).
- `_on_scan_watchdog_timeout`: if the wait is still open, push a
  warning, disconnect the `filesystem_changed` listener so a
  delayed signal can't double-call, then dispatch
  `_finish_scan_wait`. `_finish_scan_wait`'s existing
  `if not _waiting_for_scan: return` guard makes it idempotent
  against the signal-arrives-just-before-watchdog race.
- `_finish_scan_wait` now also stops the still-running timer on the
  happy path, so a queued late `timeout` can't fire after a
  successful scan.

Worst case after the watchdog fires: the new files aren't visible
on the first frame the new plugin enables, but they get picked up
on the next scan. That's strictly better than the prior dock-
permanently-disabled state.

Tests in `test_project/tests/test_update_reload_runner.gd`:
- `test_watchdog_timeout_proceeds_when_signal_never_fires` — the
  deadlock case the issue describes.
- `test_watchdog_no_op_when_signal_already_settled` — late-Timer
  race after the signal won.
- `test_finish_scan_wait_stops_armed_watchdog` — happy path
  cancels the Timer.
- `test_watchdog_timer_reused_across_arms` — second scan in the
  same update doesn't leak a second Timer child.

Tests fire `_on_scan_watchdog_timeout` and `_finish_scan_wait`
directly rather than waiting for the wall-clock Timer, keeping
runs deterministic and sub-second.

Closes #353

* Fix CI: remove add_child(runner) from watchdog tests

McpTestSuite extends RefCounted (not Node), so add_child(runner) errors
at test runtime — Linux/macOS/Windows all failed with the same root
cause. The tests fire _on_scan_watchdog_timeout() directly instead of
relying on the Timer counting down, so the runner doesn't actually need
to be in a SceneTree; the Timer just needs to be a child of the runner
(works regardless of where the runner sits).

* Redesign per Copilot review: sticky bypass after watchdog + tests in tree

Two issues from Copilot's review on PR #381 + the second CI failure:

(A) Cross-scan signal race [Copilot]: scan #1 watchdog'd, scan #2 then
    arms a fresh `filesystem_changed` listener. A delayed emission
    from scan #1 fires on whichever listener is currently connected
    to the shared signal — Godot can't tag emissions with their
    source scan — so scan #2's listener falsely settles before its
    actual filesystem scan completed, re-enabling the plugin against
    a potentially incomplete on-disk install. Generation-counter +
    Callable.bind doesn't help because a single emission still fires
    every connected listener (CONNECT_ONE_SHOT only auto-disconnects
    after firing).

(B) Second CI failure: `Timer.start()` requires the Timer to be
    inside a SceneTree. The previous fix removed `add_child(runner)`
    from tests but didn't parent the runner anywhere, so when
    `_arm_scan_watchdog` called `add_child(timer)` and `timer.start()`
    inside the runner, the timer wasn't in any tree.

Fix:

- Add a sticky `_scan_timed_out` flag set by `_on_scan_watchdog_timeout`.
- `_start_filesystem_scan` checks the flag at the top and bypasses the
  connect + `fs.scan()` path entirely, falling straight through to
  `call_deferred(deferred_step)`. With no listener armed, a delayed
  emission from the timed-out scan has nothing to satisfy. Godot's
  normal background scan catches up after the plugin re-enables.
- Tests now parent the runner via `(Engine.get_main_loop() as
  SceneTree).root.add_child(runner)` so `Timer.start()` works.
  Cleanup via a `_free_runner` helper (remove_child + free).
- New regression test `test_subsequent_scan_after_watchdog_bypasses_
  listener_arm` explicitly drives scan #1 → watchdog → scan #2 and
  asserts scan #2 does NOT set `_waiting_for_scan = true` (i.e. no
  listener armed for a delayed scan-#1 emission to satisfy).
- Existing watchdog test `test_watchdog_no_op_when_signal_already_settled`
  also asserts `_scan_timed_out` stays false on the late-Timer-after-
  successful-signal path — guards against the watchdog poisoning
  subsequent scans when no actual deadlock happened.

---------

Co-authored-by: Claude <noreply@anthropic.com>
dsarno added a commit that referenced this pull request Jun 3, 2026
…#518) (#519)

The opaque INTERNAL_ERROR 'regression' on game_eval (1.42%->6.15%) was a thin-baseline cohort artifact plus #500 reclassifying ~15s TimeoutErrors into fast ~3s INTERNAL_ERRORs; the genuine ~10s hang is flat ~2.8%. Carve the play-session-up-but-capture-not-ready failure into its own EVAL_GAME_NOT_READY code (mirrors #491, and is #518's Suggested Action #2) so the opaque bucket again means 'the eval truly hung'. Relabel only; no timing machinery touched.

Refs #518.
dsarno added a commit that referenced this pull request Jun 9, 2026
…dator

The strong traversal validator from #347 was wired into only filesystem and
script handlers; ~12 other path-taking handlers used a bare begins_with("res://")
check (which does not reject "..") or no check at all (relying solely on
ResourceLoader.exists/load). This unifies all of them on McpPathValidator and
extends the validator with two checks.

Validator (utils/path_validator.gd):
- Reject embedded null bytes (truncation trap — the path written could differ
  from the one validated).
- New for_write flag: write callers additionally refuse res://project.godot,
  the res://.godot/ metadata dir, and .import sidecars (overwriting these
  corrupts the project / import cache). Reads still permit inspecting them.
  Signature is backward-compatible (for_write defaults to false).

Writers now validate with for_write=true (closes the traversal-write primitive):
  resource_io.save_to_disk (backs resource/environment/texture/curve saves),
  scene create_scene/save_scene_as, material/theme save helpers, filesystem
  write_text, script create/patch.

Load/read sites now validate (closes the unvalidated-load surface, incl. the
@tool-script-execution risk from open_scene/create_node):
  resource load/assign + nested object-property loads, scene open_scene,
  script attach_script, node create_node scene_path + set_property resource
  values, audio set_stream + list root, material assign/shader/list,
  ui theme + stylebox, autoload path, curve set_points, particle mesh/
  material/texture, material_values.load_texture.

Path-validation rejections report VALUE_OUT_OF_RANGE (the code these handlers
already used for "must start with res://"), keeping the INVALID_PARAMS catch-all
count under the audit-v2 #21 ceiling enforced by test_error_code_distribution.
The four pre-existing validator sites that already wrapped errors as
INVALID_PARAMS (filesystem, script create/read/patch/find, resource_io save,
material _validate_material_path) are left unchanged.

set_stream and the other load handlers now reject user:// (consistent with the
validator's existing test_user_prefix_rejected policy); the audio test fixture
moves from user:// to a res:// .tres registered via EditorFileSystem.update_file.

Tests (run against live Godot 4.6.3): validator unit tests for null-byte +
write blocklist + read-allows; scene traversal/manifest-overwrite rejection.
All affected suites pass (path_validator/scene/resource/node/material/theme/
curve/particle/ui/audio/autoload/filesystem/script). test_error_code_distribution
passes. Parse check clean.

Addresses advisory GHSA-p5x8-v25q-qw69 (GH-1, GH-2, GH-3, GH-4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
dsarno added a commit that referenced this pull request Jun 10, 2026
…dator

The strong traversal validator from #347 was wired into only filesystem and
script handlers; ~12 other path-taking handlers used a bare begins_with("res://")
check (which does not reject "..") or no check at all (relying solely on
ResourceLoader.exists/load). This unifies all of them on McpPathValidator and
extends the validator with two checks.

Validator (utils/path_validator.gd):
- Reject embedded null bytes (truncation trap — the path written could differ
  from the one validated).
- New for_write flag: write callers additionally refuse res://project.godot,
  the res://.godot/ metadata dir, and .import sidecars (overwriting these
  corrupts the project / import cache). Reads still permit inspecting them.
  Signature is backward-compatible (for_write defaults to false).

Writers now validate with for_write=true (closes the traversal-write primitive):
  resource_io.save_to_disk (backs resource/environment/texture/curve saves),
  scene create_scene/save_scene_as, material/theme save helpers, filesystem
  write_text, script create/patch.

Load/read sites now validate (closes the unvalidated-load surface, incl. the
@tool-script-execution risk from open_scene/create_node):
  resource load/assign + nested object-property loads, scene open_scene,
  script attach_script, node create_node scene_path + set_property resource
  values, audio set_stream + list root, material assign/shader/list,
  ui theme + stylebox, autoload path, curve set_points, particle mesh/
  material/texture, material_values.load_texture.

Path-validation rejections report VALUE_OUT_OF_RANGE (the code these handlers
already used for "must start with res://"), keeping the INVALID_PARAMS catch-all
count under the audit-v2 #21 ceiling enforced by test_error_code_distribution.
The four pre-existing validator sites that already wrapped errors as
INVALID_PARAMS (filesystem, script create/read/patch/find, resource_io save,
material _validate_material_path) are left unchanged.

set_stream and the other load handlers now reject user:// (consistent with the
validator's existing test_user_prefix_rejected policy); the audio test fixture
moves from user:// to a res:// .tres registered via EditorFileSystem.update_file.

Tests (run against live Godot 4.6.3): validator unit tests for null-byte +
write blocklist + read-allows; scene traversal/manifest-overwrite rejection.
All affected suites pass (path_validator/scene/resource/node/material/theme/
curve/particle/ui/audio/autoload/filesystem/script). test_error_code_distribution
passes. Parse check clean.

Addresses advisory GHSA-p5x8-v25q-qw69 (GH-1, GH-2, GH-3, GH-4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
dsarno added a commit that referenced this pull request Jun 10, 2026
…dator (#546)

* harden(handlers): route every path-taking handler through McpPathValidator

The strong traversal validator from #347 was wired into only filesystem and
script handlers; ~12 other path-taking handlers used a bare begins_with("res://")
check (which does not reject "..") or no check at all (relying solely on
ResourceLoader.exists/load). This unifies all of them on McpPathValidator and
extends the validator with two checks.

Validator (utils/path_validator.gd):
- Reject embedded null bytes (truncation trap — the path written could differ
  from the one validated).
- New for_write flag: write callers additionally refuse res://project.godot,
  the res://.godot/ metadata dir, and .import sidecars (overwriting these
  corrupts the project / import cache). Reads still permit inspecting them.
  Signature is backward-compatible (for_write defaults to false).

Writers now validate with for_write=true (closes the traversal-write primitive):
  resource_io.save_to_disk (backs resource/environment/texture/curve saves),
  scene create_scene/save_scene_as, material/theme save helpers, filesystem
  write_text, script create/patch.

Load/read sites now validate (closes the unvalidated-load surface, incl. the
@tool-script-execution risk from open_scene/create_node):
  resource load/assign + nested object-property loads, scene open_scene,
  script attach_script, node create_node scene_path + set_property resource
  values, audio set_stream + list root, material assign/shader/list,
  ui theme + stylebox, autoload path, curve set_points, particle mesh/
  material/texture, material_values.load_texture.

Path-validation rejections report VALUE_OUT_OF_RANGE (the code these handlers
already used for "must start with res://"), keeping the INVALID_PARAMS catch-all
count under the audit-v2 #21 ceiling enforced by test_error_code_distribution.
The four pre-existing validator sites that already wrapped errors as
INVALID_PARAMS (filesystem, script create/read/patch/find, resource_io save,
material _validate_material_path) are left unchanged.

set_stream and the other load handlers now reject user:// (consistent with the
validator's existing test_user_prefix_rejected policy); the audio test fixture
moves from user:// to a res:// .tres registered via EditorFileSystem.update_file.

Tests (run against live Godot 4.6.3): validator unit tests for null-byte +
write blocklist + read-allows; scene traversal/manifest-overwrite rejection.
All affected suites pass (path_validator/scene/resource/node/material/theme/
curve/particle/ui/audio/autoload/filesystem/script). test_error_code_distribution
passes. Parse check clean.

Addresses advisory GHSA-p5x8-v25q-qw69 (GH-1, GH-2, GH-3, GH-4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* harden(handlers): address review — uid:// loads, case-fold blocklist, shared helper

Follow-up to the McpPathValidator unification, fixing the code-review findings.

Correctness:
- Restore uid:// and user:// support on load handlers. ResourceLoader accepts
  both (uid:// is an opaque resource id that can't express traversal; user://
  is the user data sandbox), but routing every load through the strict res://
  validator rejected them — a regression for uid:// refs copied from .tscn/.uid
  and for user:// runtime assets. New validate_loadable_path() accepts res://
  (confined), uid:// (as-is), and user:// (confined under the user root), and
  load sites now use it: set_property/assign_resource/resource property dicts,
  open_scene, create_node, attach_script, set_stream, particle mesh/material/
  texture, material assign/shader, ui theme/stylebox, curve, material_values.
- Case-fold the write blocklist. macOS (APFS) and Windows (NTFS) are
  case-insensitive by default, so res://Project.godot resolved to the real
  project.godot and slipped past a case-sensitive compare.
- Add override.cfg to the write blocklist (applied over project.godot at
  startup — same takeover surface as the manifest).
- Reads no longer run the write blocklist: _validate_material_path /
  _validate_res_path / _load_material_from_path / _load_theme_from_params take
  for_write, passed true only by the create/save callers. get_material and
  apply_theme (pure reads) no longer return a spurious "Refusing to write".
- Stop blocking .import writes — editing import config then reimporting is a
  legitimate, recoverable workflow; the blocklist is the startup-execution
  surface (manifest, override.cfg, .godot/) only.

Cleanup / altitude:
- Single error-code choke point: McpPathValidator.path_error / loadable_error
  return a ready error dict (MISSING_REQUIRED_PARAM for empty, VALUE_OUT_OF_RANGE
  for invalid) so every handler reports the same code for the same failure.
  filesystem/script move off their old INVALID_PARAMS wrapping onto this.
- curve set_points validates the load path (the save layer, resource_io.save_to_disk,
  remains the authoritative write-confinement check) instead of double-running
  the write validator.
- Drop the redundant is_empty() pre-check in material_values.load_texture.
- Document the lexical-containment (no symlink resolution) limitation in the
  validator — GDScript has no realpath; the loopback boundary is the control.

Tests (live Godot 4.6.3, all suites 0 failures): uid:// / user:// acceptance,
user:// traversal + unknown-scheme rejection, case-insensitive blocklist,
override.cfg block, .import now allowed; audio fixture moved back to user://
(no repo/EFS pollution). test_path_traversal_guard counts any McpPathValidator
delegation; missing-path tests now assert MISSING_REQUIRED_PARAM.

Addresses review findings 1-10 on GHSA-p5x8-v25q-qw69 (path confinement).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* harden(handlers): address Copilot review on path validation

- Guard the null-byte check against an empty String.chr(0) sentinel in both
  validate_resource_path and validate_loadable_path. On builds that normalize
  embedded nulls away (e.g. 4.3), contains("") would be true and reject every
  path; a String that can't hold a null can't smuggle one, so skip the check.
- Update stale "res:// path" comments in ui_handler (stylebox override) and
  material_values.load_texture — both now accept uid:// / user:// via
  validate_loadable_path.
- Tighten test_path_traversal_guard: assert attach_script is present and require
  >=5 McpPathValidator delegations in script_handler, so a regression where
  attach_script stops validating its path is caught.

Live Godot 4.6.3: path_validator/filesystem/script/ui/material/resource/scene/
audio suites all pass; test_path_traversal_guard green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* style: wrap long func-name tuple in test_path_traversal_guard (ruff E501)

The 5-entry func list exceeded the 100-char line limit after adding
attach_script. Pure formatting; the assertion is unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(handlers): align remaining res:// messages with uid:// / user:// support

Copilot re-review: now that the theme / stylebox / shader load paths accept
uid:// and user:// via loadable_error, the surrounding comments, the
build_layout docstring, the stylebox fallback error ("expects a res:// path"),
and the shader_path missing-arg error still said res:// only. Text-only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants