Skip to content

Commit 8933203

Browse files
dsarnoclaude
andauthored
game_eval: bound eval with a deadline and always free the temp node (#488)
* game_eval: bound eval with a deadline and always free the temp node Evaluated GDScript that awaits something which never completes (a signal that never fires, a timer on a paused tree) hung _handle_eval's `await temp_node.execute()` with no escape hatch: the reply and queue_free() below were never reached, leaking the eval Node and pinning the request until the dispatcher's 15s deferred budget / the server's 15s command timeout fired it back as an opaque INTERNAL_ERROR. Telemetry shows this is ~19-20% of all game_eval calls, steady across 2.5.6-2.5.9 and all three platforms (500+ installs), with failed-call durations clustered at the 10-15s timeout thresholds vs ~40ms for successes. Drive execute() as a fire-and-forget coroutine recording into a holder, poll process_frame until done or an 8s deadline (comfortably under the 15s ceiling, with round-trip margin), and on timeout detach + free the node and reply with an actionable mcp:eval_error instead of letting the request ride to INTERNAL_ERROR. Does not cover CPU-bound loops with no await (they block the main thread, including this poll). Refs #487 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * game_eval: document timeout ordering + extract eval reply helpers Follow-up cleanups from a code review of the deadline guard, all behavior-preserving: - Document the load-bearing cross-file timeout ordering (game 8s < editor 10s < dispatcher 15s). Only the game-side EVAL_TIMEOUT_SEC guard emits the actionable "Eval exceeded 8s" message; the editor's fallback timer emits a generic one. The relationship lives in three separate files with nothing enforcing it, so a tweak to any one could silently swap the actionable diagnostic for the generic timeout. Note added at both EVAL_TIMEOUT_SEC and request_game_eval. - Note the accepted residual leak in _drive_eval: an await that never fires leaves the coroutine suspended and its (detached) node unfreed for the game-process lifetime. GDScript can't cancel a suspended coroutine; still strictly better than the pre-#487 in-tree leak. - Extract _reply_eval_error / _reply_eval_response to match the file's existing _reply_error / _reply_game_command_error convention, replacing four inline EngineDebugger.send_message call sites. Verified in a live editor (4.6.3): normal eval, sub-deadline await, and the 8s timeout path all return correctly through the new helpers; no node leak; headless --import is parse-clean. Refs #487 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 4b3f76d commit 8933203

2 files changed

Lines changed: 103 additions & 7 deletions

File tree

plugin/addons/godot_ai/debugger/mcp_debugger_plugin.gd

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -331,6 +331,13 @@ func _clear_pending(request_id: String) -> void:
331331

332332
## --- game_eval: execute arbitrary GDScript in the running game ---
333333

334+
## Editor-side fallback timer for game_eval. MUST stay above the game-side
335+
## EVAL_TIMEOUT_SEC (8.0) in runtime/game_helper.gd and below the dispatcher's
336+
## game_eval budget (15000 ms) in dispatcher.gd — i.e. game 8s < editor 10s <
337+
## dispatcher 15s. This timer only fires when the game never replies at all,
338+
## and its message (the timeout_callable below) is intentionally generic. Drop
339+
## timeout_sec at/below 8s and it pre-empts the game's actionable "Eval
340+
## exceeded 8s" message — see the TIMEOUT ORDERING note on EVAL_TIMEOUT_SEC.
334341
func request_game_eval(
335342
code: String,
336343
request_id: String,

plugin/addons/godot_ai/runtime/game_helper.gd

Lines changed: 96 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -519,12 +519,38 @@ func _mouse_button_index(name: String) -> int:
519519

520520
## --- game_eval: execute arbitrary GDScript in the running game ---
521521

522+
## Wall-clock ceiling for a single game_eval. Evaluated code that awaits
523+
## something which never completes (a signal that never fires, a timer on a
524+
## paused tree) would otherwise pin the request open until the dispatcher's
525+
## 15s deferred budget / the server's 15s command timeout fires it as an
526+
## opaque INTERNAL_ERROR — with the temp eval Node leaked into the tree.
527+
## Bounding it here lets us free the node and reply with an actionable
528+
## message instead. See hi-godot/godot-ai#487.
529+
##
530+
## TIMEOUT ORDERING — load-bearing across three files: this value MUST stay
531+
## below the editor-side fallback timer in
532+
## `debugger/mcp_debugger_plugin.gd::request_game_eval` (`timeout_sec`,
533+
## default 10.0), which in turn stays below the dispatcher's `game_eval`
534+
## budget in `dispatcher.gd` (15000 ms). So: game 8s < editor 10s <
535+
## dispatcher 15s. Only this game-side guard emits the actionable
536+
## "Eval exceeded 8s" message; the editor timer emits a *generic* "Game eval
537+
## timed out" message. Raise this at/above the editor timer (or drop that
538+
## timer below this) and the generic message wins the race, silently losing
539+
## the diagnostic this fix exists to provide. Nothing enforces the order —
540+
## change one, re-check the other two.
541+
##
542+
## NOTE: this catches a hung `await`, not a CPU-bound loop with no `await` —
543+
## a tight `while true:` with no yield blocks the main thread, so nothing
544+
## (including this poll) runs until it yields. That case is out of scope.
545+
const EVAL_TIMEOUT_SEC := 8.0
546+
547+
522548
func _handle_eval(data: Array) -> void:
523549
var request_id: String = data[0] if data.size() > 0 else ""
524550
var code: String = data[1] if data.size() > 1 else ""
525551

526552
if code.is_empty():
527-
EngineDebugger.send_message("mcp:eval_error", [request_id, "No code provided"])
553+
_reply_eval_error(request_id, "No code provided")
528554
return
529555

530556
## Wrap user code so we can capture a return value.
@@ -543,22 +569,85 @@ func _handle_eval(data: Array) -> void:
543569
script.source_code = script_source
544570
var err: int = script.reload()
545571
if err != OK:
546-
EngineDebugger.send_message("mcp:eval_error",
547-
[request_id, "Failed to compile GDScript (error %d). Check syntax." % err])
572+
_reply_eval_error(request_id,
573+
"Failed to compile GDScript (error %d). Check syntax." % err)
548574
return
549575

550576
var temp_node := Node.new()
551577
temp_node.set_script(script)
552578
temp_node.process_mode = Node.PROCESS_MODE_ALWAYS
553579
add_child(temp_node)
554580

555-
var result = null
556-
if temp_node.has_method("execute"):
557-
result = await temp_node.execute()
581+
if not temp_node.has_method("execute"):
582+
temp_node.queue_free()
583+
_reply_eval_error(request_id, "Internal error: eval wrapper is missing execute().")
584+
return
585+
586+
## Drive execute() as a fire-and-forget coroutine that records its
587+
## outcome into `holder`, then poll frames until it finishes or the
588+
## deadline passes. A plain `await temp_node.execute()` has no escape
589+
## hatch: if user code never returns, we never reach the reply/cleanup
590+
## below and the request hangs with the node leaked.
591+
var holder := {"done": false, "value": null, "abandoned": false}
592+
_drive_eval(temp_node, holder)
593+
594+
var tree := get_tree()
595+
var deadline_ms := int(EVAL_TIMEOUT_SEC * 1000.0)
596+
var start_ms := Time.get_ticks_msec()
597+
## process_frame fires every idle frame regardless of tree pause, so this
598+
## deadline still elapses while the game is paused.
599+
while not holder["done"] and (Time.get_ticks_msec() - start_ms) < deadline_ms:
600+
await tree.process_frame
601+
602+
if not holder["done"]:
603+
## Still running past the deadline. Mark abandoned so a late
604+
## completion in _drive_eval drops its result and frees the node,
605+
## detach the node now so it stops touching the scene, and reply with
606+
## a clear timeout instead of letting the request ride to INTERNAL_ERROR.
607+
holder["abandoned"] = true
608+
if is_instance_valid(temp_node):
609+
remove_child(temp_node)
610+
_reply_eval_error(request_id,
611+
("Eval exceeded %ds and was aborted — the code likely awaits "
612+
+ "something that never completes (a signal that never fires, a timer on "
613+
+ "a paused tree) or loops forever. Check logs_read(source='game').")
614+
% int(EVAL_TIMEOUT_SEC))
615+
return
558616

559617
temp_node.queue_free()
618+
_reply_eval_response(request_id, holder["value"])
619+
620+
621+
## Run the compiled eval node's execute() and stash the result. Kept
622+
## separate from _handle_eval so the latter can race it against a deadline
623+
## via frame polling. If the eval was abandoned (timed out) before this
624+
## resumes, drop the result and free the now-detached node — _handle_eval
625+
## has already replied.
626+
##
627+
## RESIDUAL LEAK (accepted): if the awaited thing *never* fires, this
628+
## coroutine never resumes, so the `node` it holds is detached (via
629+
## _handle_eval's remove_child) but never freed — one orphaned Node per such
630+
## timeout, for the game-process lifetime. GDScript has no way to cancel a
631+
## suspended coroutine, so this is the best achievable in-process. It is still
632+
## strictly better than the pre-#487 behavior, where the node leaked *into*
633+
## the live tree and the request hung to the 15s ceiling.
634+
func _drive_eval(node: Node, holder: Dictionary) -> void:
635+
var value = await node.execute()
636+
if holder.get("abandoned", false):
637+
if is_instance_valid(node):
638+
node.queue_free()
639+
return
640+
holder["value"] = value
641+
holder["done"] = true
642+
643+
644+
func _reply_eval_error(request_id: String, message: String) -> void:
645+
EngineDebugger.send_message("mcp:eval_error", [request_id, message])
646+
647+
648+
func _reply_eval_response(request_id: String, value: Variant) -> void:
560649
EngineDebugger.send_message("mcp:eval_response",
561-
[request_id, JSON.stringify(_variant_to_json(result))])
650+
[request_id, JSON.stringify(_variant_to_json(value))])
562651

563652

564653
func _indent_eval_code(code: String) -> String:

0 commit comments

Comments
 (0)