Skip to content

Commit 3295ded

Browse files
dsarnoclaude
andauthored
[Audit v2 #345] Reject non-loopback Host/Origin (DNS-rebinding guard) (#375)
* Reject non-loopback Host/Origin on WS + HTTP transports (audit-v2 #1) Closes #345 — refs #343 (audit-v2 umbrella). Both transports bound to 127.0.0.1, but a malicious browser tab could mount **DNS rebinding**: resolve attacker.example.com to 127.0.0.1, then `new WebSocket("ws://attacker.example.com:9500")`. The request lands on our loopback socket carrying a non-localhost Host (and Origin), the WS handshake registered the peer as a session, and any MCP tool became drivable from a foreign origin — including write tools. The streamable-HTTP transport on :8000 had the same gap, including `/godot-ai/status` (comment-marked "small unauthenticated probe"). Per the issue's "Fix shape": **strict Host/Origin allowlist in a Starlette middleware ahead of FastMCP** plus a matching `process_request` hook on the WebSocket server. `src/godot_ai/transport/origin_guard.py` (new): - `is_allowed_host` — accepts `127.0.0.1`, `localhost`, `[::1]` with optional `:port`; case-insensitive; bare `::1` rejected (RFC 7230 requires bracketed IPv6 in Host). - `is_allowed_origin` — None / empty / `null` accepted (native clients omit Origin; sandboxed/file:// emit `null`); otherwise must parse to a URL whose hostname is loopback. Schemes outside `http/https/ws/wss` rejected. - `LocalhostOnlyHTTPMiddleware` — ASGI middleware. Rejects on first failure with HTTP 403; passes lifespan scopes through; handles duplicate-Host smuggling shapes (fail closed when the same header appears more than once). - `make_websocket_request_guard()` — `process_request` hook for `websockets.asyncio.server.serve(...)` mirroring the same logic. Uses `headers.get_all` so a smuggled duplicate Host fails closed rather than tripping `MultipleValuesError` and surfacing as 500. Wiring: - `transport/websocket.py`: pass `process_request=` to `serve()`. - `server.py`: outermost wrap on the HTTP app, applied to every HTTP transport (`http`, `streamable-http`, `sse`) so `/godot-ai/status` and the FastMCP endpoints are guarded uniformly. Native clients keep working: the Godot plugin's WebSocketPeer sends `Host: 127.0.0.1:<port>` and no Origin. Verified by smoke against a real WebSocketServer (loopback connect → `handshake_ack`; rebound Origin → 403) and against the streamable-HTTP transport (`/godot-ai/status`: loopback → 200, `Host: attacker.example.com` → 403, loopback Host + browser Origin → 403). Tests: - `tests/unit/test_origin_guard.py` (44 tests): parametrized helper coverage (loopback / sneaky-substring / IP / malformed / multiple smuggled values), middleware behavior, lifespan passthrough, `state` `__getattr__` passthrough. - `tests/integration/test_websocket.py::TestDnsRebindingGuard` (5 tests): live `websockets` server + client. Loopback succeeds, non-loopback Host returns 403, non-loopback Origin returns 403, loopback-shaped Origin succeeds, rejected request never registers a session. - `tests/unit/test_asgi_session_diagnostics.py`: structural assertions updated for the new outer wrap; FastMCP `state` lookup now traverses one extra layer. - `tests/unit/test_server_status.py`: `TestClient(base_url=...)` set to a loopback host so the request passes the new guard. Test plan: - ruff check + format: clean - pytest -q: 825 passed (+45 new) - script/ci-check-gdscript: clean (no GDScript touched) - Live smoke: real `GodotWebSocketServer` and real streamable-HTTP app, loopback succeeds, non-loopback Host/Origin → 403. Note: PR #366 (audit-v2 #2, session-hijack) closes the duplicate-handshake takeover. This PR closes the rebinding-from-browser path. Together they harden the WebSocket transport at both layers (who can connect, and what they can do once they have). * Simplify: extract evaluate_loopback so WS + HTTP guards can't drift Both the WebSocket ``process_request`` hook and ``LocalhostOnlyHTTPMiddleware`` implemented the same 3-step rule (count duplicate headers → check ``is_allowed_host`` → check ``is_allowed_origin`` → 403) twice, with slightly different intermediate state. Funnel both through one ``evaluate_loopback(hosts, origins) -> bool`` so a regression in one transport cannot accidentally diverge from the other. Drive-by: - Decode FORBIDDEN_BODY once at module load (FORBIDDEN_BODY_TEXT) so the WS guard doesn't decode the same bytes per rejection. - Drop redundant ``parsed.scheme.lower()`` — ``urlsplit`` already normalises the scheme per RFC 3986. No behavior change. 825 tests pass. * Address Copilot review: reject Origin: null + Sec-Fetch-Site cross-origin Two browser-side liveness/bypass shapes that the first cut of the loopback guard let through: 1. **``Origin: null``** (Copilot, origin_guard.py:85). Sandboxed ``<iframe sandbox>`` and downloaded ``file://`` pages serialize their origin as ``null``. The original guard accepted this on the theory that file:// /sandboxed contexts are "legitimate"; in practice they're exactly the bypass an attacker would use to bridge a foreign origin onto our loopback socket. **Now rejected.** Native clients never produce ``null`` (they omit Origin entirely), so this tightening is invisible to the Godot plugin / FastMCP CLI / curl. 2. **Cross-origin no-cors subresource probes** (Copilot, server.py:79). `<img src="http://127.0.0.1:9500/godot-ai/status">` from ``https://attacker.example.com`` arrives with a loopback ``Host`` and *no* ``Origin`` (browsers omit Origin for ``no-cors`` GETs of ``<img>`` / ``<script>`` / ``<link>``), so the original Host/Origin gate let it through and the page could use the 200/403 outcome as a liveness oracle. Browsers stamp every HTTP request with ``Sec-Fetch-Site`` (``cross-site`` / ``same-site`` / ``same-origin`` / ``none``). Native non-browser clients never send it. **The guard now rejects ``cross-site`` and ``same-site``** while still allowing ``none`` (top-level navigation: user typed URL, bookmark) and ``same-origin`` (loopback page fetching its own server). Missing → allow (native client), preserving the existing flow. Drive-by: trailing-dot loopback (``Host: localhost.``) now accepted. RFC 1034 valid FQDN syntax that browsers and curl can preserve through to the Host header — friction trap if rejected, no security implication either way. `evaluate_loopback` now takes an optional ``sec_fetch_sites`` list so both transports run the new policy through the same single helper — preserves the audit-v2 invariant that the WS hook and HTTP middleware cannot drift. Tests: - New parametrized helper coverage for ``is_allowed_sec_fetch_site`` (friendly + foreign + case-insensitive + whitespace), trailing-dot hosts, and explicit ``Origin: null`` rejection. - Two new ``evaluate_loopback`` cases pinning the cross-site rejection with no-Origin (the exact Copilot shape) and ``Origin: null`` with any Sec-Fetch-Site value. - Three new ``LocalhostOnlyHTTPMiddleware`` middleware cases: ``Origin: null`` rejected, cross-site subresource rejected, top-level navigation accepted. - Three new live-WS integration cases pinning the same rules end-to-end through the websockets ``process_request`` hook, plus a bracketed-IPv6 ``http://[::1]:9500`` symmetry test for the WS path. Live smoke: native loopback connect → ``handshake_ack`` ✓; ``Origin: null`` WS connect → 403 ✓; ``Sec-Fetch-Site: cross-site`` WS connect → 403 ✓; ``Origin: null`` HTTP probe → 403 ✓; cross-site HTTP probe → 403 ✓. --------- Co-authored-by: Claude <noreply@anthropic.com>
1 parent 39366fd commit 3295ded

7 files changed

Lines changed: 853 additions & 6 deletions

File tree

src/godot_ai/server.py

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@
5151
from godot_ai.tools.testing import register_testing_tools
5252
from godot_ai.tools.theme import register_theme_tools
5353
from godot_ai.tools.ui import register_ui_tools
54+
from godot_ai.transport.origin_guard import LocalhostOnlyHTTPMiddleware
5455
from godot_ai.transport.websocket import GodotWebSocketServer
5556

5657
logger = logging.getLogger(__name__)
@@ -70,8 +71,12 @@ def http_app(self, *args: Any, **kwargs: Any):
7071
app = super().http_app(*args, **kwargs)
7172
transport = kwargs.get("transport", "http")
7273
if transport in ("http", "streamable-http"):
73-
return StaleMcpSessionDiagnosticMiddleware(app)
74-
return app
74+
app = StaleMcpSessionDiagnosticMiddleware(app)
75+
## Outermost wrap: refuse non-loopback Host/Origin (DNS-rebinding
76+
## guard, audit-v2 finding #1). Applied to every HTTP transport
77+
## including ``sse`` so ``/godot-ai/status`` and the FastMCP
78+
## endpoints are guarded uniformly.
79+
return LocalhostOnlyHTTPMiddleware(app)
7580

7681

7782
def create_server(
Lines changed: 247 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,247 @@
1+
"""Loopback Host/Origin guard for the WebSocket and HTTP transports.
2+
3+
The WebSocket server binds to ``127.0.0.1`` and the streamable-HTTP
4+
transport likewise. That stops *direct* off-host traffic but does not
5+
stop a browser tab on a malicious origin from mounting **DNS rebinding**:
6+
the browser resolves ``attacker.example.com`` to ``127.0.0.1`` and then
7+
issues ``new WebSocket("ws://attacker.example.com:9500")``. The request
8+
lands on our loopback socket carrying a non-localhost ``Host`` (and
9+
``Origin``) header.
10+
11+
This module enforces a Host/Origin allowlist that rejects those rebound
12+
requests *before* the WebSocket upgrade runs or any HTTP route fires:
13+
14+
- ``Host`` must resolve to one of ``127.0.0.1``, ``localhost`` or
15+
``[::1]`` — with an optional ``:port``. RFC 7230 requires bracketed
16+
form for IPv6 in HTTP/1.1 Host headers, so a bare ``::1`` is not
17+
accepted (would be a malformed request).
18+
- ``Origin`` is validated only when present (native non-browser clients
19+
omit it). A present Origin must be empty or a URL whose hostname
20+
matches the loopback allowlist. ``Origin: null`` is **rejected** —
21+
browsers emit it from sandboxed iframes and downloaded ``file://``
22+
pages, which is exactly the rebinding-bypass shape. Native clients
23+
do not produce ``null`` (they omit Origin entirely).
24+
- ``Sec-Fetch-Site`` is also checked when present: any value other than
25+
``same-origin`` / ``none`` (i.e. browser-issued cross-origin
26+
subresources or navigations from a foreign page) is refused. This
27+
catches `<img src=...>` / `<link>` / `<script>` liveness oracles
28+
against ``/godot-ai/status``, where browsers send a loopback ``Host``
29+
and *no* ``Origin`` for "no-cors" subresource loads. Native clients
30+
never send ``Sec-Fetch-*`` (it's a Fetch-Metadata header set only by
31+
browsers), so missing means "allow".
32+
33+
Native clients (the Godot plugin, the FastMCP CLI client, ``curl`` with
34+
no ``-H Origin``) keep working unchanged. Browser-driven traffic — even
35+
``no-cors`` subresources that wouldn't carry an Origin — is refused with
36+
HTTP 403 long before reaching FastMCP or our session registry.
37+
38+
See umbrella #343, finding #1 (audit-v2).
39+
"""
40+
41+
from __future__ import annotations
42+
43+
from http import HTTPStatus
44+
from typing import Any
45+
from urllib.parse import urlsplit
46+
47+
from starlette.types import ASGIApp, Receive, Scope, Send
48+
49+
LOOPBACK_HOSTNAMES: frozenset[str] = frozenset({"127.0.0.1", "localhost", "[::1]"})
50+
LOOPBACK_ORIGIN_SCHEMES: frozenset[str] = frozenset({"http", "https", "ws", "wss"})
51+
52+
FORBIDDEN_BODY = (
53+
b"forbidden: non-loopback Host or Origin (DNS rebinding guard)\n"
54+
b"see https://github.com/hi-godot/godot-ai issue #345 for details\n"
55+
)
56+
FORBIDDEN_BODY_TEXT = FORBIDDEN_BODY.decode("utf-8")
57+
58+
59+
## Sec-Fetch-Site values that indicate the request is browser-driven and
60+
## NOT a top-level navigation or same-origin operation. Modern browsers
61+
## always send Sec-Fetch-Site on HTTP requests (including ``no-cors``
62+
## subresources like ``<img>`` / ``<script>`` / ``<link>`` that carry
63+
## *no* Origin); native non-browser clients never send it.
64+
SEC_FETCH_SITE_FOREIGN: frozenset[str] = frozenset({"cross-site", "same-site"})
65+
66+
67+
def _normalise_host(host: str) -> str:
68+
"""Return ``host`` with a trailing ``:port`` stripped, lowercased,
69+
and with any single trailing DNS root dot removed.
70+
71+
Handles bracketed IPv6 (``[::1]:9500`` → ``[::1]``), bare-name forms
72+
(``LOCALHOST:8000`` → ``localhost``), and the rare-but-valid trailing
73+
dot (``localhost.`` → ``localhost``) so the allowlist lookup is
74+
independent of caller punctuation.
75+
"""
76+
if not host:
77+
return host
78+
if host.startswith("[") and "]" in host:
79+
without_port = host[: host.index("]") + 1]
80+
else:
81+
without_port = host.split(":", 1)[0]
82+
normalised = without_port.lower()
83+
if normalised.endswith(".") and not normalised.endswith(".]"):
84+
normalised = normalised.rstrip(".")
85+
return normalised
86+
87+
88+
def is_allowed_host(host_header: str | None) -> bool:
89+
"""Whether ``host_header`` resolves to a loopback name.
90+
91+
Empty or missing returns False — a properly formed HTTP/1.1 request
92+
always carries a Host header, and refusing the request is safer than
93+
guessing. The WebSocket guard mirrors this.
94+
"""
95+
if not host_header:
96+
return False
97+
return _normalise_host(host_header.strip()) in LOOPBACK_HOSTNAMES
98+
99+
100+
def is_allowed_origin(origin_header: str | None) -> bool:
101+
"""Whether ``origin_header`` is absent or names a loopback URL.
102+
103+
Native clients do not send Origin. Browsers always do, and the
104+
request is rejected unless the Origin parses to a loopback URL.
105+
``Origin: null`` is rejected — sandboxed iframes and downloaded
106+
``file://`` pages emit it, which is the exact bypass an attacker
107+
would use to bridge a foreign origin onto our loopback socket.
108+
"""
109+
if origin_header is None:
110+
return True
111+
value = origin_header.strip()
112+
if not value:
113+
return True
114+
if value.lower() == "null":
115+
return False
116+
parsed = urlsplit(value)
117+
## ``urlsplit`` already lowercases the scheme per RFC 3986, so no
118+
## extra normalization is needed before the set lookup.
119+
if parsed.scheme not in LOOPBACK_ORIGIN_SCHEMES:
120+
return False
121+
if not parsed.hostname:
122+
return False
123+
hostname = parsed.hostname.lower().rstrip(".")
124+
# urlsplit strips IPv6 brackets — re-add for the bracketed-form lookup.
125+
bracketed = f"[{hostname}]" if ":" in hostname else hostname
126+
return hostname in LOOPBACK_HOSTNAMES or bracketed in LOOPBACK_HOSTNAMES
127+
128+
129+
def is_allowed_sec_fetch_site(value: str | None) -> bool:
130+
"""Whether the ``Sec-Fetch-Site`` header indicates a non-foreign request.
131+
132+
Modern browsers stamp every HTTP request with one of ``cross-site``,
133+
``same-site``, ``same-origin`` or ``none`` (top-level navigation /
134+
bookmark). Native clients never send it. Treat missing as "allow"
135+
(native client) and the foreign values as "reject" — the rest of the
136+
allowlist still has to pass, this is just an early-out for the
137+
`<img src=...>` / `<script src=...>` cross-origin probe shape that
138+
would otherwise slip past a loopback Host / missing Origin.
139+
"""
140+
if value is None:
141+
return True
142+
return value.strip().lower() not in SEC_FETCH_SITE_FOREIGN
143+
144+
145+
def evaluate_loopback(
146+
hosts: list[str],
147+
origins: list[str],
148+
sec_fetch_sites: list[str] | None = None,
149+
) -> bool:
150+
"""Return True iff the request's headers pass the allowlist.
151+
152+
Both transports (ASGI middleware + WebSocket ``process_request``)
153+
funnel their per-request header extraction through this helper so
154+
the duplicate-header smuggling rule, the value-allowlist rule, and
155+
the Sec-Fetch-Site cross-origin reject rule are evaluated identically.
156+
A divergence between the two transports would be a security
157+
regression — this helper exists to prevent it.
158+
"""
159+
if len(hosts) > 1 or len(origins) > 1:
160+
return False
161+
if sec_fetch_sites and len(sec_fetch_sites) > 1:
162+
return False
163+
host = hosts[0] if hosts else None
164+
origin = origins[0] if origins else None
165+
sec_fetch_site = sec_fetch_sites[0] if sec_fetch_sites else None
166+
return (
167+
is_allowed_host(host)
168+
and is_allowed_origin(origin)
169+
and is_allowed_sec_fetch_site(sec_fetch_site)
170+
)
171+
172+
173+
class LocalhostOnlyHTTPMiddleware:
174+
"""ASGI middleware that rejects HTTP requests off the loopback allowlist.
175+
176+
Wraps the FastMCP ASGI app so the guard runs *before* the MCP
177+
streamable-HTTP session manager, before ``/godot-ai/status``, and
178+
before any inner middleware. Non-HTTP scopes (lifespan) pass through.
179+
"""
180+
181+
def __init__(self, app: ASGIApp) -> None:
182+
self.app = app
183+
184+
def __getattr__(self, name: str) -> Any:
185+
# Mirror StaleMcpSessionDiagnosticMiddleware: FastMCP / uvicorn
186+
# introspect attributes (e.g. ``state``) on the wrapped ASGI app.
187+
return getattr(self.app, name)
188+
189+
async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
190+
if scope["type"] != "http":
191+
await self.app(scope, receive, send)
192+
return
193+
194+
hosts: list[str] = []
195+
origins: list[str] = []
196+
sec_fetch_sites: list[str] = []
197+
for raw_key, raw_value in scope.get("headers", []):
198+
key = raw_key.lower()
199+
if key == b"host":
200+
hosts.append(raw_value.decode("latin-1"))
201+
elif key == b"origin":
202+
origins.append(raw_value.decode("latin-1"))
203+
elif key == b"sec-fetch-site":
204+
sec_fetch_sites.append(raw_value.decode("latin-1"))
205+
206+
if evaluate_loopback(hosts, origins, sec_fetch_sites):
207+
await self.app(scope, receive, send)
208+
return
209+
await _send_forbidden(send)
210+
211+
212+
async def _send_forbidden(send: Send) -> None:
213+
await send(
214+
{
215+
"type": "http.response.start",
216+
"status": HTTPStatus.FORBIDDEN,
217+
"headers": [
218+
(b"content-type", b"text/plain; charset=utf-8"),
219+
(b"content-length", str(len(FORBIDDEN_BODY)).encode("ascii")),
220+
],
221+
}
222+
)
223+
await send({"type": "http.response.body", "body": FORBIDDEN_BODY, "more_body": False})
224+
225+
226+
def make_websocket_request_guard():
227+
"""Return a ``process_request`` hook for ``websockets.asyncio.server.serve``.
228+
229+
The hook fires before the WebSocket upgrade. When Host or Origin
230+
fails the loopback allowlist the hook synthesizes an HTTP 403 via
231+
``connection.respond(...)``; returning that response from
232+
``process_request`` aborts the upgrade without ever creating a
233+
Session.
234+
"""
235+
236+
async def guard(connection, request):
237+
## Use ``get_all`` so a smuggled duplicate (two ``Host:`` lines)
238+
## fails closed rather than tripping ``MultipleValuesError`` at
239+
## ``request.headers.get(...)`` and surfacing as an opaque 500.
240+
hosts = list(request.headers.get_all("Host"))
241+
origins = list(request.headers.get_all("Origin"))
242+
sec_fetch_sites = list(request.headers.get_all("Sec-Fetch-Site"))
243+
if evaluate_loopback(hosts, origins, sec_fetch_sites):
244+
return None
245+
return connection.respond(HTTPStatus.FORBIDDEN, FORBIDDEN_BODY_TEXT)
246+
247+
return guard

src/godot_ai/transport/websocket.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
from godot_ai import __version__ as _SERVER_VERSION
1515
from godot_ai.protocol.envelope import CommandRequest, CommandResponse, HandshakeMessage
1616
from godot_ai.sessions.registry import Session, SessionRegistry
17+
from godot_ai.transport.origin_guard import make_websocket_request_guard
1718

1819
logger = logging.getLogger(__name__)
1920

@@ -37,6 +38,10 @@ async def start(self):
3738
"127.0.0.1",
3839
self.port,
3940
max_size=4 * 1024 * 1024, # 4 MB for screenshot base64
41+
# Reject DNS-rebinding attempts before the upgrade — see
42+
# godot_ai.transport.origin_guard. Native plugin clients
43+
# carry a loopback Host and no Origin, so they pass through.
44+
process_request=make_websocket_request_guard(),
4045
):
4146
await asyncio.Future() # run forever
4247
except OSError as e:

0 commit comments

Comments
 (0)