Skip to content

[Audit v2 #351] Pydantic-validate WebSocket event payloads#378

Merged
dsarno merged 1 commit into
betafrom
claude/audit-v2-351-validate-ws-events
May 5, 2026
Merged

[Audit v2 #351] Pydantic-validate WebSocket event payloads#378
dsarno merged 1 commit into
betafrom
claude/audit-v2-351-validate-ws-events

Conversation

@dsarno

@dsarno dsarno commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes the audit-v2 #7 finding from #343: WebSocket event payloads are now Pydantic-validated before they touch typed Session fields.

_handle_event previously did event_data.get(...) blind assignment — HandshakeMessage and CommandResponse were validated, but the three plugin-emitted state events (scene_changed, play_state_changed, readiness_changed) weren't. A malformed plugin event (or a hijacked WS via the surface #345 / #346 covered) put non-string values straight into Session.current_scene / play_state / readiness, which then shipped to MCP clients verbatim through Session.to_dict().

Fix

  • New Pydantic models in src/godot_ai/protocol/envelope.py: SceneChangedEvent, PlayStateChangedEvent, ReadinessChangedEvent. Each carries one typed string field; defaults match the old .get(default) shape so a missing key still falls through to the same safe value the caller had been assigning.
  • _handle_event runs each known event payload through model_validate() inside a try block. On ValidationError the event is dropped with a structured warning log naming the event type + session prefix — the typed Session field stays at its current value rather than being corrupted.
  • Unknown event types remain silently ignored (forward-compat for future plugin versions).

Diff is small: +140 / −10 across the three files.

Test plan

  • ruff check src/ tests/ — clean
  • ruff format --check on touched files — clean
  • pytest -q872 passed (5 new in TestEventValidation)
  • script/ci-check-gdscript — All GDScript files OK
  • New regression tests (tests/integration/test_websocket.py::TestEventValidation):
    • test_scene_changed_with_non_string_payload_is_dropped — sends {"current_scene": 12345}, asserts the typed field is unchanged + the warning was logged
    • test_play_state_changed_with_non_string_payload_is_dropped — sends a list payload
    • test_readiness_changed_with_non_string_payload_is_dropped — sends a nested dict
    • test_unknown_event_type_is_silently_ignored — forward-compat: unknown event names don't raise or mutate state
    • test_valid_event_after_malformed_one_still_applies — recovery: a valid event after a dropped one still updates the typed field on the same connection
  • Existing TestEvents (3 tests covering the happy path) still passes
  • Live editor test_run skipped — pure Python transport-layer change, no GDScript or handler edits

Deviations from "Fix shape"

None. The issue called for "add SceneChangedEvent, PlayStateChangedEvent, ReadinessChangedEvent Pydantic models in protocol/envelope.py; validate in _handle_event." That's the diff.

Cross-references

Closes #351
Umbrella: #343 (next reliability sweep — adjacent to #345 / #346 / #349 transport-hardening landings)

https://claude.ai/code/session_01ERwAhFK6CEZLRigwK1iC2k


Generated by Claude Code

`_handle_event` blindly assigned `event_data.get(...)` to typed Session
fields; HandshakeMessage and CommandResponse were validated, but events
weren't. A malformed plugin event (or hijacked WS via #345/#346 surface)
shipped non-string values straight to MCP clients via Session.to_dict().

Add three Pydantic models in protocol/envelope.py — SceneChangedEvent,
PlayStateChangedEvent, ReadinessChangedEvent — and route the three
event branches in _handle_event through model_validate(). On
ValidationError the event is dropped with a structured warning log
(naming the event type and session) rather than corrupting the typed
Session fields.

The model defaults match the previous .get(default) shape so a missing
key falls through to the same safe value the caller had been assigning.

Tests: tests/integration/test_websocket.py::TestEventValidation
- non-string payload for each of the three event types is dropped and
  the corresponding Session field is unchanged
- unknown event types are silently ignored (forward-compat)
- a valid event after a dropped malformed one still applies (recovery)

Closes #351

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses audit-v2 finding #7 (#351) by ensuring plugin-emitted WebSocket state event payloads are Pydantic-validated before mutating typed Session fields, preventing malformed event data from being propagated to MCP clients via Session.to_dict().

Changes:

  • Added Pydantic models for scene_changed, play_state_changed, and readiness_changed event payloads in protocol/envelope.py.
  • Updated GodotWebSocketServer._handle_event() to validate known event payloads with model_validate(), dropping malformed events with a warning log on ValidationError.
  • Added integration tests covering malformed payload dropping, forward-compatible ignoring of unknown event types, and recovery (valid event after malformed).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/godot_ai/protocol/envelope.py Introduces typed Pydantic models for plugin state event payloads with safe defaults.
src/godot_ai/transport/websocket.py Validates known event payloads before mutating Session state; logs and drops malformed payloads.
tests/integration/test_websocket.py Adds regression tests ensuring malformed events are dropped, unknown events are ignored, and recovery works.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov

codecov Bot commented May 5, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

dsarno commented May 5, 2026

Copy link
Copy Markdown
Contributor Author

Post-PR shepherding complete — ready for merge

All four required signals green on SHA 5f37ea5:

  • CI: 15/15 checks succeeded (Python 3.11+3.13 / Linux+macOS+Windows, Release smoke ×3 OS, Godot tests ×3 OS, Game capture smoke ×3 OS, plus Prepare/Agent/Upload/Cleanup)
  • Copilot review: 0 comments generated (3/3 files reviewed)
  • /simplify (3 sub-agents — reuse, quality, efficiency): no actionable findings in scope. The only non-blocking efficiency note (logger.warning rate-limit under attack) is bounded in the realistic case by the plugin's _check_state_changes running in _process() (≤60Hz) and was deliberately deferred as out-of-scope per CLAUDE.md ("Don't add features… beyond what the task requires"). Filing as a follow-up issue if log spam becomes observed.
  • /review: clean. Defaults preserve old event_data.get(default) semantics, try/except placement keeps unknown-event forward-compat working, validation parallels the existing HandshakeMessage / CommandResponse patterns. Codecov reports 100% coverage on modified lines.

No follow-up commits needed. Awaiting maintainer merge.

cc @dsarno


Generated by Claude Code

@dsarno dsarno merged commit d7a2efc into beta May 5, 2026
15 checks passed
dsarno added a commit that referenced this pull request May 6, 2026
51 commits forward from main, headlined by two architecture audits:

audit-v1 (issue #297, PRs #298-#315): scene-path ancestry guard,
update/config data-loss safeguards, lifecycle reliability,
characterization tests, plugin.gd extraction (McpPortResolver +
McpServerLifecycleManager), state-model cleanup, McpUpdateManager
extraction, Runtime Protocol deletion + DirectRuntime retype, narrowed
meta-tool JSON coercion, self-update preload-alias hardening, locked
FastMCP middleware order.

audit-v2 (issue #343, PRs #369-#390): origin allowlist (DNS-rebinding
guard, #345/#375), path-traversal guards on script_* / filesystem_*
writes (#347/#369), errno.EADDRINUSE portability across all OSes
(#348/#373), SessionRegistry RLock removal (#350/#370),
Pydantic-validated WebSocket event payloads (#351/#378),
sole-survivor auto-failover on active disconnect (#352/#379), 30s
filesystem_changed watchdog during update reload (#353/#381),
FAILED_MIXED self-update visibility via mixed_state in editor_state
(#354/#382), 32/tick inbound packet-drain cap with spillover logging
(#356/#383), error-code vocabulary enrichment (#365/#385:
NODE_NOT_FOUND, PROPERTY_NOT_ON_CLASS, VALUE_OUT_OF_RANGE,
MISSING_REQUIRED_PARAM cut INVALID_PARAMS sites 471 -> 97),
resolve-or-error helper extraction (#364/#389: 38+ duplicate sites
migrated), resource-form lint for meta-tool reads (#363/#386),
LogViewer + PortPickerPanel extraction from mcp_dock.gd (#360/#390),
LogViewer.tick() buffer-clear recovery (#392).

Conflict resolutions (both intentional, both took beta's side):

- plugin/addons/godot_ai/animation_handler.gd plus new submodules
  animation_presets.gd and animation_values.gd: beta's 4-domain split
  retained. Main's revert (#368) was CI-flake hygiene, not a structural
  rejection per its own commit message ("Empty commit to retrigger CI
  (flake on Godot tests / macOS)"). Beta's #367 is the canonical state
  the audit found wanted.

- plugin/addons/godot_ai/clients/_toml_strategy.gd: beta's version
  retained. Beta handles inline comments after section headers
  (`[next_section] # note`) via _is_any_section_header(); main's later
  re-derived bare-key fix uses the simpler bracket check and would
  regress on commented headers. The four main-only Windows / TOML
  hotfixes (#302, #318, #319, #320) all landed on beta independently
  under different commit hashes and are content-equivalent or
  beta-improved; no cherry-pick was needed before this merge.

Validation:

- CI: green on beta tip 72b35d7 (and on PR #392 separately) across
  ubuntu-latest / macos-latest / windows-latest for Python tests,
  Godot tests, release-smoke, and game-capture-smoke.
- Operator smoke (macOS, 2026-05-06): all 10 phases of the
  audit-v1+v2 post-landing runbook green. Report on issue #343.
  47 GDScript suites + 903 Python tests passing. Phase 7 interactive
  self-update verified end-to-end with the local-self-update-smoke
  harness; plugin advanced 2.3.2 -> 2.3.3, no .ips, no _exit_tree
  leak, server stop/start clean.
- Operator smoke (Windows 11, beta tip d5aa29f, 2026-05-06): Phase 7
  self-update green (7/7; macOS-only .ips check correctly downgraded
  to SKIP). Path-traversal guard rejects backslash variant. Cursor
  client configure cycle round-trips cleanly. editor_state ping 109ms.
  Pre-existing pytest UnicodeDecodeError on Windows tracked separately
  as #397 (also on main, not introduced by this merge).

# Conflicts:
#	test_project/tests/test_clients.gd
@dsarno dsarno deleted the claude/audit-v2-351-validate-ws-events branch May 7, 2026 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants