Skip to content

[Audit v2 #354] Surface FAILED_MIXED self-update state in editor_state + dock#382

Merged
dsarno merged 3 commits into
betafrom
claude/audit-v2-354-mixed-state-diagnostic
May 5, 2026
Merged

[Audit v2 #354] Surface FAILED_MIXED self-update state in editor_state + dock#382
dsarno merged 3 commits into
betafrom
claude/audit-v2-354-mixed-state-diagnostic

Conversation

@dsarno

@dsarno dsarno commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes audit-v2 finding #10 (#354): when self-update's runner reports InstallStatus.FAILED_MIXED (rollback couldn't restore the previous addon contents), the runner correctly refuses to re-enable the plugin — but the half-installed state was previously invisible to both MCP agents and the dock. Users saw "plugin won't start," re-ran the update, and compounded the mismatch.

This PR adds a structured "addon dir is in MIXED state, here are the .update_backup files" diagnostic in two places:

  • MCP editor_state — adds a mixed_state field with addon_dir, backup_files, backup_count, truncated, and message. The field is omitted entirely when the addons tree is clean, so agents distinguish "MIXED" vs "clean" on key presence and the normal payload stays small.
  • Dock banner — a separate red/amber banner (independent of the spawn-failure crash panel) listing the leftover *.update_backup paths plus a Re-scan button so the user can dismiss the banner after restoring the addon manually from VCS or a fresh release ZIP.

Design

Both surfaces consume the same scanner — plugin/addons/godot_ai/utils/update_mixed_state.gd:

  • Walks res://addons/godot_ai/ recursively for files ending in .update_backup.
  • Caps results at MAX_BACKUP_RESULTS = 200; on hit, sets truncated: true so an agent reading editor_state knows the list isn't exhaustive.
  • Returns an empty Dictionary when the tree is clean (callers gate on is_empty()).
  • Sorted output so two consecutive scans diff cleanly.
  • No marker file: .update_backup files on disk ARE the evidence. This is resilient to a runner crash that happens before a hypothetical marker write — and it works automatically across editor restarts (which is the realistic case here, since by the time FAILED_MIXED lands the plugin has already been disabled and the dock detached).

editor_handler.gd::get_editor_state calls UpdateMixedState.diagnose() and adds the mixed_state field only when non-empty. mcp_dock.gd builds a new banner during _build_ui and refreshes it once on dock construction (_refresh_mixed_state_banner), with the same helper bound to the Re-scan button.

The runner itself is unchanged. This PR is a pure read-side surface for a state the runner already produces correctly.

Test plan

  • ruff check src/ tests/ — clean
  • ruff format --check src/ tests/ — my new test file reformatted; everything else clean
  • pytest -q878 passed
  • script/ci-check-gdscript — All GDScript files OK
  • New Python tests in tests/unit/test_editor_state_mixed_state.py:
    • test_editor_state_omits_mixed_state_when_clean — clean addons tree → field absent
    • test_editor_state_passes_through_mixed_state_diagnostic — MIXED → every field surfaces unchanged (addon_dir, backup_files, backup_count, truncated, message)
    • test_editor_state_passes_through_truncated_mixed_statetruncated: true survives the passthrough
  • New GDScript tests in test_project/tests/test_update_mixed_state.gd:
    • test_find_backups_empty_when_dir_clean
    • test_find_backups_detects_top_level
    • test_find_backups_recurses_subdirectories — also asserts sorted output
    • test_find_backups_ignores_non_backup_files.uid, .tres, .cfg etc. don't match
    • test_find_backups_returns_empty_for_missing_dir — bare clone won't crash the scanner
    • test_diagnose_empty_when_clean
    • test_diagnose_carries_structured_fields — pins every key on the contract
    • test_find_backups_caps_results_at_max — pathological 1000s-of-backups scenario won't blow up editor_state

Interactive smoke (script/local-self-update-smoke) — NOT run from this session

Per CLAUDE.md, changes that touch mcp_dock.gd "update check/download/install paths" or update_reload_runner.gd trigger the interactive harness. This PR touches mcp_dock.gd only to add a separate banner (in _build_ui) that reads addons-tree state — none of the update check / download / install paths change. update_reload_runner.gd is untouched. The runner's FAILED_MIXED semantics are exactly as #297 / _rollback_paths_written left them.

That said, the interactive harness is not runnable from this CI session (requires a real Godot editor and manual click). Mitigations:

  1. The change is purely additive. Clean addons tree → scanner returns {} → banner stays hidden → editor_state payload unchanged. Existing happy-path smoke output is byte-identical to pre-PR.
  2. The MIXED-state branch is unit-tested in isolation. The eight GDScript tests drive the scanner against a scratch dir under user://, so we exercise every diagnostic field and the truncation path without needing to corrupt a real addons dir.
  3. The Python passthrough has its own coverage — three unit tests pin clean / MIXED / truncated.

Recommendation for @dsarno: please run script/local-self-update-smoke locally before merging if you want belt-and-suspenders verification. The new banner construction is on the happy path of _build_ui, but it consults disk via DirAccess — worth a sanity check that it doesn't add visible latency to the dock paint.

Deviations from "Fix shape"

None. The issue called for "a structured 'addon dir is in MIXED state, here are the .update_backup files' diagnostic in the dock crash panel and editor_state." Implemented as a separate banner (rather than mutating the existing _crash_panel) because the spawn-failure crash panel is keyed off ServerStateScript and is_terminal_diagnosis — orthogonal to addon-tree state. The MIXED banner can show concurrent with a healthy server, and a healthy server can run alongside MIXED-state addon files.

Cross-references

Closes #354
Umbrella: #343 (audit-v2 reliability sweep — adjacent to #353 / #345 / #349 / #351 / #352)

🤖 Generated with Claude Code


Generated by Claude Code

…e + dock

Closes audit-v2 finding #10 (#354): the runner correctly refuses to
re-enable the plugin when its rollback can't restore the previous
addons/godot_ai/ contents (`InstallStatus.FAILED_MIXED`), but the
half-installed state was previously invisible to both agents and the
dock — users saw "plugin won't start," re-ran the update, and
compounded the mismatch.

Adds a structured "addon dir is in MIXED state, here are the
.update_backup files" diagnostic surfaced in two places:

- `editor_state` MCP tool: `mixed_state` field (omitted when the addons
  tree is clean) carrying `addon_dir`, `backup_files`, `backup_count`,
  `truncated`, `message`. An MCP agent can read and report this.
- Dock: a red/amber banner above the update banner listing the leftover
  *.update_backup paths plus a Re-scan button that re-checks the addon
  dir after a manual restore.

Both surfaces consume the same scanner (`utils/update_mixed_state.gd`),
which walks `res://addons/godot_ai/` for `*.update_backup` files,
caps the result list at 200 entries (with `truncated: true` so the
agent knows when output is windowed), and returns an empty Dictionary
when the tree is clean. The scanner doesn't depend on a separate
marker file — `.update_backup` files on disk ARE the evidence, which
makes the surface resilient to runner crashes that happen before
any marker write.

Tests:
- `tests/unit/test_editor_state_mixed_state.py`: pins the Python
  passthrough (clean omits the field; non-clean surfaces every key;
  truncated flag survives).
- `test_project/tests/test_update_mixed_state.gd`: pins the scanner
  contract (clean dir, top-level + nested matches, sorted output,
  non-backup files ignored, missing dir doesn't crash, MAX cap and
  `truncated` flag).
Copilot AI review requested due to automatic review settings May 5, 2026 19:14
@codecov

codecov Bot commented May 5, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new read-side diagnostic for self-update FAILED_MIXED states so both MCP editor_state and the Godot dock can surface leftover .update_backup evidence when the addon tree is half-installed.

Changes:

  • Adds update_mixed_state.gd, a recursive scanner that finds .update_backup files and returns a structured diagnostic.
  • Extends editor_handler.gd::get_editor_state() to include mixed_state only when the addon tree is dirty.
  • Adds a new dock banner with a re-scan action, plus Python/GDScript tests for the scanner and Python passthrough.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/unit/test_editor_state_mixed_state.py Adds Python unit tests for editor_state mixed-state passthrough behavior.
test_project/tests/test_update_mixed_state.gd.uid Adds Godot UID sidecar for the new GDScript test file.
test_project/tests/test_update_mixed_state.gd Adds GDScript tests for backup discovery, recursion, truncation, and diagnose output.
plugin/addons/godot_ai/utils/update_mixed_state.gd.uid Adds Godot UID sidecar for the new scanner script.
plugin/addons/godot_ai/utils/update_mixed_state.gd Implements the mixed-state backup scanner and diagnostic payload builder.
plugin/addons/godot_ai/mcp_dock.gd Adds a dock banner that displays mixed-state diagnostics and supports re-scan.
plugin/addons/godot_ai/handlers/editor_handler.gd Surfaces the mixed-state diagnostic through get_editor_state().

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +68 to +79
var backups := find_backups(dir)
if backups.is_empty():
return {}
return {
"addon_dir": dir,
"backup_files": backups,
"backup_count": backups.size(),
"truncated": backups.size() >= MAX_BACKUP_RESULTS,
"message": (
"Self-update rollback failed; addons/godot_ai/ contains a mix of"
+ " old and new files. Restore the addon from your VCS or a fresh"
+ " release ZIP, then delete the listed *.update_backup files."
Comment on lines +31 to +33
while not stack.is_empty():
if results.size() >= MAX_BACKUP_RESULTS:
break
Comment on lines +41 to +43
var mixed_state := UpdateMixedState.diagnose()
if not mixed_state.is_empty():
data["mixed_state"] = mixed_state
Comment on lines +443 to +444
_build_mixed_state_banner()
_refresh_mixed_state_banner()
claude added 2 commits May 5, 2026 19:27
Hot path: `editor_state` is one of the always-loaded core MCP tools
and agents poll it constantly. The recursive `DirAccess` walk added
in the previous commit ran on every poll. Add a 5-second TTL cache
keyed on the production `ADDON_DIR`; tests passing scratch dirs still
see fresh scans (cache only tracks the production path), and the
dock's Re-scan button passes `force=true` so a manual fix is reflected
immediately.

Walk determinism: when truncation kicks in, two scans of the same
mixed tree could return different 200-file slices because Godot's
`list_dir` order isn't guaranteed stable. Sort entries within each
directory and push subdirs reverse-sorted so DFS pops them in
ascending order. Result is deterministic across calls.

Message accuracy: the scanner can't distinguish FAILED_MIXED from
"successful install whose `_finalize_install_success` cleanup hit a
permission error" — both leave `.update_backup` files behind. Soften
the operator copy to acknowledge both cases; the recovery action
(delete the files) is the same.

Dock test coverage: extract `_apply_mixed_state_banner_diagnostic` as
a render seam so `test_dock.gd` can exercise the banner against
synthetic diagnostics without polluting `addons/godot_ai/` with
backup files. Adds four new dock tests pinning visibility,
re-rendering, dismissal, and truncation hint.

Reuse: alias `BACKUP_SUFFIX` from `UpdateReloadRunner.INSTALL_BACKUP_SUFFIX`
so the producer and the scanner can never drift on the suffix value.

Cleanup: drop the redundant `DirAccess.dir_exists_absolute` pre-check
(let `DirAccess.open(...)` returning null be the single failure path),
drop unnecessary `String(...)` coercions on values that are already
typed strings, and add `clear_cache()` for tests that flip state
between calls.
The previous commit's `_mixed_state_label` was a `RichTextLabel`. Its
`.text` property reflects the BBCode source, not the rendered content
appended via `add_text(...)` — so the dock test reading
`_mixed_state_label.text` got `""` even though the banner was rendering
correctly. CI's three Godot test jobs (Linux/macOS/Windows) failed on
`test_mixed_state_banner_renders_synthetic_diagnostic` and
`test_mixed_state_banner_renders_truncated_hint`.

Switch the message label to a plain `Label` (matches the existing
`_drift_label` pattern — one-shot single-line message, no scroll, no
selection beyond the file list). The file-list widget stays a
`RichTextLabel` because it needs the scroll region for long lists; the
test reads it via `get_parsed_text()` which returns the rendered string
content (not the BBCode source).

The previous commit's `String(...)` cast on the assignment is preserved
as a defensive type pin against `Dictionary.get()` returning Variant —
no functional change there, just kept as documented.
@dsarno dsarno merged commit a95a272 into beta May 5, 2026
11 checks passed
dsarno added a commit that referenced this pull request May 6, 2026
51 commits forward from main, headlined by two architecture audits:

audit-v1 (issue #297, PRs #298-#315): scene-path ancestry guard,
update/config data-loss safeguards, lifecycle reliability,
characterization tests, plugin.gd extraction (McpPortResolver +
McpServerLifecycleManager), state-model cleanup, McpUpdateManager
extraction, Runtime Protocol deletion + DirectRuntime retype, narrowed
meta-tool JSON coercion, self-update preload-alias hardening, locked
FastMCP middleware order.

audit-v2 (issue #343, PRs #369-#390): origin allowlist (DNS-rebinding
guard, #345/#375), path-traversal guards on script_* / filesystem_*
writes (#347/#369), errno.EADDRINUSE portability across all OSes
(#348/#373), SessionRegistry RLock removal (#350/#370),
Pydantic-validated WebSocket event payloads (#351/#378),
sole-survivor auto-failover on active disconnect (#352/#379), 30s
filesystem_changed watchdog during update reload (#353/#381),
FAILED_MIXED self-update visibility via mixed_state in editor_state
(#354/#382), 32/tick inbound packet-drain cap with spillover logging
(#356/#383), error-code vocabulary enrichment (#365/#385:
NODE_NOT_FOUND, PROPERTY_NOT_ON_CLASS, VALUE_OUT_OF_RANGE,
MISSING_REQUIRED_PARAM cut INVALID_PARAMS sites 471 -> 97),
resolve-or-error helper extraction (#364/#389: 38+ duplicate sites
migrated), resource-form lint for meta-tool reads (#363/#386),
LogViewer + PortPickerPanel extraction from mcp_dock.gd (#360/#390),
LogViewer.tick() buffer-clear recovery (#392).

Conflict resolutions (both intentional, both took beta's side):

- plugin/addons/godot_ai/animation_handler.gd plus new submodules
  animation_presets.gd and animation_values.gd: beta's 4-domain split
  retained. Main's revert (#368) was CI-flake hygiene, not a structural
  rejection per its own commit message ("Empty commit to retrigger CI
  (flake on Godot tests / macOS)"). Beta's #367 is the canonical state
  the audit found wanted.

- plugin/addons/godot_ai/clients/_toml_strategy.gd: beta's version
  retained. Beta handles inline comments after section headers
  (`[next_section] # note`) via _is_any_section_header(); main's later
  re-derived bare-key fix uses the simpler bracket check and would
  regress on commented headers. The four main-only Windows / TOML
  hotfixes (#302, #318, #319, #320) all landed on beta independently
  under different commit hashes and are content-equivalent or
  beta-improved; no cherry-pick was needed before this merge.

Validation:

- CI: green on beta tip 72b35d7 (and on PR #392 separately) across
  ubuntu-latest / macos-latest / windows-latest for Python tests,
  Godot tests, release-smoke, and game-capture-smoke.
- Operator smoke (macOS, 2026-05-06): all 10 phases of the
  audit-v1+v2 post-landing runbook green. Report on issue #343.
  47 GDScript suites + 903 Python tests passing. Phase 7 interactive
  self-update verified end-to-end with the local-self-update-smoke
  harness; plugin advanced 2.3.2 -> 2.3.3, no .ips, no _exit_tree
  leak, server stop/start clean.
- Operator smoke (Windows 11, beta tip d5aa29f, 2026-05-06): Phase 7
  self-update green (7/7; macOS-only .ips check correctly downgraded
  to SKIP). Path-traversal guard rejects backslash variant. Cursor
  client configure cycle round-trips cleanly. editor_state ping 109ms.
  Pre-existing pytest UnicodeDecodeError on Windows tracked separately
  as #397 (also on main, not introduced by this merge).

# Conflicts:
#	test_project/tests/test_clients.gd
@dsarno dsarno deleted the claude/audit-v2-354-mixed-state-diagnostic branch May 7, 2026 14:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants