Skip to content

Tighten 7 tests that passed without verifying behavior#458

Merged
dsarno merged 3 commits into
mainfrom
claude/review-test-quality-sFWOn
May 17, 2026
Merged

Tighten 7 tests that passed without verifying behavior#458
dsarno merged 3 commits into
mainfrom
claude/review-test-quality-sFWOn

Conversation

@dsarno

@dsarno dsarno commented May 17, 2026

Copy link
Copy Markdown
Contributor

Summary

Audit of the test suites (Python + GDScript) surfaced 7 tests whose assertions did not actually pin down the behavior they claimed to test — they would have stayed green under regressions in the code paths they exercise. Five matched the "counts instead of stored Variants" / assert_has_key weakness called out in CLAUDE.md's test-hygiene checklist; two were Python-side cousins (overly broad pytest.raises(Exception) and a zero-assertion test).

GDScript fixes

  • test_theme.gd::test_theme_set_color_accepts_dict — only checked the response envelope. Now reloads the theme and asserts the stored value is a Color with the right channels, so a missing dict→Color coercion can't pass.
  • test_theme.gd::test_create_theme_overwrite_allowed and test_material.gd::test_create_overwrite_allowed — now assert overwritten=true on the response and that the target file still exists, instead of just confirming the call returned data.
  • test_material.gd::test_set_param_bool — now reads emission_enabled back off the loaded material (matching the read-back pattern its float and color siblings already use).
  • test_animation.gd::test_add_property_track_transition_raw_float — now reads the stored keyframes and asserts both that the dict values coerced to Vector3 and that the raw-float transition landed on each key. The prior version would pass even if either silently no-oped.

Python fixes

  • test_runtime_handlers.py::test_audio_player_create_blocks_when_not_writable — was catching bare Exception, which would mask unrelated bugs. Narrowed to GodotCommandError to match the surrounding readiness-gate tests.
  • test_self_update_smoke_harness.py::test_v240_preflight_passes_when_both_files_present — had zero assertions; it called the validator and returned. Added assert smoke._require_v240_plus_addon_shape(...) is None so the runner's zero-assertion guard has something to verify and the intent is documented.

Notes — wider patterns observed (not fixed here)

  • ~200 calls to assert_is_error(result) across the GDScript suite omit the expected error code (vs ~153 that supply one). That accepts any error as success, including unrelated bugs. Worth fixing in tests whose names claim a specific failure (e.g. test_create_node_non_node_type) but is too large to bundle into this PR.
  • The editor_undo(_undo_redo) calls flagged at test_node.gd:283, 298, 928 are cleanup-only undos with no post-undo assertion, so they don't violate the CLAUDE.md rule. Left as-is.
  • test_ui.gd:524/551/575/607 use the non-_override theme getters but are gated on has_theme_*_override(...) first, which guards against the silent-fallback failure mode. Godot 4.6 dropped the _override getter variants, per the inline comment. Left as-is.

Test plan

  • ruff check tests/unit/test_runtime_handlers.py tests/unit/test_self_update_smoke_harness.py — clean
  • pytest tests/unit/test_runtime_handlers.py tests/unit/test_self_update_smoke_harness.py -q — 263 passed
  • GDScript test runner (test_run via MCP) — to be exercised when CI picks up the branch / a reviewer runs the editor; the edits are syntactic-mirror copies of patterns already used elsewhere in the same files

https://claude.ai/code/session_01H8CLhwuNUA4NoEZ4neJk78


Generated by Claude Code

Audit of the GDScript + Python test suites surfaced seven tests whose
assertions didn't actually pin down the behavior they claimed to test —
they would have stayed green under regressions in the code paths they
exercise.

GDScript (test_project/tests/):
- test_theme.gd::test_theme_set_color_accepts_dict only checked the
  response envelope. Now reloads the theme and asserts the stored value
  is a Color with the right channels, so a missing dict->Color coercion
  can't pass.
- test_theme.gd::test_create_theme_overwrite_allowed and
  test_material.gd::test_create_overwrite_allowed now assert
  overwritten=true on the response and that the target file still
  exists, instead of just confirming the call returned data.
- test_material.gd::test_set_param_bool now reads emission_enabled back
  off the loaded material (matching the read-back pattern its float and
  color siblings already use).
- test_animation.gd::test_add_property_track_transition_raw_float now
  reads the stored keyframes and asserts both that the dict values
  coerced to Vector3 and that the raw-float transition landed on each
  key — the prior version would pass even if either silently no-oped.

Python (tests/unit/):
- test_runtime_handlers.py::test_audio_player_create_blocks_when_not_writable
  was catching bare Exception, which would mask unrelated bugs. Narrowed
  to GodotCommandError to match the surrounding readiness-gate tests.
- test_self_update_smoke_harness.py::test_v240_preflight_passes_when_both_files_present
  had zero assertions — it called the validator and returned. Added an
  assertion that the call returns None so the runner's zero-assertion
  guard has something to verify and the intent is documented.

https://claude.ai/code/session_01H8CLhwuNUA4NoEZ4neJk78
@codecov

codecov Bot commented May 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Linux + macOS Godot tests passed on the previous run; Windows failed in
~89s (vs ~134s on green main-branch runs), suggesting either a flake or
an early-pipeline failure. Pushing this empty commit to disambiguate.

https://claude.ai/code/session_01H8CLhwuNUA4NoEZ4neJk78

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves the reliability of the test suite by strengthening 7 tests (Python + GDScript) that previously could pass without actually validating the intended behavior, aligning assertions with real persisted/round-tripped outcomes where applicable.

Changes:

  • Add an explicit assertion to prevent a zero-assertion Python unit test from silently passing.
  • Narrow a Python exception expectation from a broad Exception to GodotCommandError for readiness-gated behavior.
  • Strengthen several GDScript tests by reading back stored resources/values and asserting the actual persisted (or runtime-stored) state, plus overwrite metadata.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/unit/test_self_update_smoke_harness.py Adds an assertion to ensure the preflight validator test actually verifies success.
tests/unit/test_runtime_handlers.py Narrows expected exception type for a non-writable-session gate to GodotCommandError.
test_project/tests/test_theme.gd Tightens overwrite and dict→Color behavior checks via additional assertions and resource read-back.
test_project/tests/test_material.gd Tightens overwrite and bool parameter persistence checks via resource read-back.
test_project/tests/test_animation.gd Verifies stored keyframe values and per-key transitions after adding a property track.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- test_animation.gd: add value assert for v0 so both keyframes carry the
  same depth of check (was: only v1 had a value assert; v0 was type-only).
- test_theme.gd: assert the alpha channel in
  test_theme_set_color_accepts_dict — the test dict explicitly passes
  a=1.0, so leaving it unchecked left a gap in the dict→Color coercion
  coverage.

https://claude.ai/code/session_01H8CLhwuNUA4NoEZ4neJk78
@dsarno dsarno merged commit bc36034 into main May 17, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants