Summary
When the opencode_cli supervisor's turn ends, pending inbox messages are never delivered. Multi-agent flows (assign → worker reply → supervisor) silently deadlock. Single-agent and pure handoff workflows are unaffected.
Tracked because docs/opencode-cli.md and the README provider table both flag opencode as experimental — single-agent flows only until this is fixed.
RCA (verified against code in PR #193)
Two compounding causes in services/inbox_service.py:
- mtime doesn't advance post-settle. LogFileHandler is scheduled under watchdog.observers.polling.PollingObserver(timeout=INBOX_POLLING_INTERVAL), with INBOX_POLLING_INTERVAL = 5, which scans TERMINAL_LOG_DIR every 5 s and emits on_modified only when a file's mtime has changed since the previous scan. OpenCode's alt-screen TUI emits no pty bytes once the turn is idle → tmux pipe-pane writes nothing → log mtime freezes → handler never fires. The DEBUG log shows four "Log file modified" events at 5 s intervals during the turn, then a 62 s gap with no events, then a single event immediately after a manual printf ' ' >> <log>.
- Idle pattern absent from the pipe-pane byte stream. Even when the handler does fire, _has_idle_pattern(tail) (inbox_service.py:59) looks for ctrl+p\s+commands in the log tail and returns False. The opencode TUI emits that footer string once during the initial render, before pipe_pane() is attached (pipe_pane() runs after provider.initialize() returns), and thereafter emits only incremental cursor/character updates.
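To illustrate the second cause, here is a minimal sketch of an idle-pattern check run against two log tails. The function name has_idle_pattern, the escaped form of the regex, and both sample byte strings are assumptions made up for illustration; they are not copied from inbox_service.py or from real opencode output.

```python
import re

# Assumed regex: the footer text "ctrl+p commands" with the "+" escaped.
# The real pattern is defined in the project code and may differ.
IDLE_PATTERN = re.compile(r"ctrl\+p\s+commands")


def has_idle_pattern(tail: str) -> bool:
    """Hypothetical stand-in for _has_idle_pattern: True if the footer is in the tail."""
    return IDLE_PATTERN.search(tail) is not None


# Invented sample of a full initial render: the footer is drawn once.
initial_render = "\x1b[2J\x1b[H> Ask anything\n  ctrl+p commands\n"

# Invented sample of later incremental updates: cursor moves plus a few
# characters, with no footer text anywhere in the stream.
incremental_tail = "\x1b[12;3Hd\x1b[12;4Ho\x1b[12;5Hn\x1b[12;6He\x1b[?25h"

print(has_idle_pattern(initial_render))    # footer present
print(has_idle_pattern(incremental_tail))  # footer absent
```

If pipe_pane() attaches only after the initial render, the log contains nothing but tails of the second kind, so the check cannot succeed however often the handler fires.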
Either cause alone blocks delivery; together they make it deterministic. check_and_send_pending_messages itself (inbox_service.py:105) uses provider.get_status() directly and is fine — the breakage is purely in the watchdog wake-up path. (Credit: @haofeif for that nuance.)
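The first cause can be modelled with a simplified snapshot-diff poller. This is a sketch of the general technique (compare mtimes between scans), not watchdog's actual implementation; snapshot, modified_since, demo, and the file name terminal.log are hypothetical names introduced here.

```python
import os
import tempfile


def snapshot(directory: str) -> dict[str, int]:
    """Map each regular file in the directory to its mtime in nanoseconds."""
    return {e.path: e.stat().st_mtime_ns for e in os.scandir(directory) if e.is_file()}


def modified_since(prev: dict[str, int], curr: dict[str, int]) -> list[str]:
    """Paths present in both scans whose mtime differs between them."""
    return sorted(p for p, m in curr.items() if p in prev and prev[p] != m)


def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as log_dir:
        log_path = os.path.join(log_dir, "terminal.log")
        with open(log_path, "w") as f:
            f.write("turn output\n")
        first = snapshot(log_dir)

        # Idle TUI: nothing is appended, so the next scan sees the same mtime
        # and a poller of this kind reports no modification.
        idle_events = modified_since(first, snapshot(log_dir))

        # Manual nudge, like printf ' ' >> <log>. The mtime is set explicitly
        # to 5 s later so the demo does not depend on timestamp granularity.
        with open(log_path, "a") as f:
            f.write(" ")
        st = os.stat(log_path)
        os.utime(log_path, ns=(st.st_atime_ns, first[log_path] + 5_000_000_000))
        nudged_events = modified_since(first, snapshot(log_dir))

        names = lambda paths: [os.path.basename(p) for p in paths]
        return names(idle_events), names(nudged_events)


idle, nudged = demo()
print(idle)    # no events while the log is untouched
print(nudged)  # one event after the manual append
```

This matches the observed DEBUG trace: events while bytes flow, silence once the file stops changing, and a single event after the manual append.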
Reproduction
Run examples/assign/ against opencode_cli. Workers reply via send_message; messages land in DB as pending and are never delivered to the supervisor.
Proposed interim fix (~20 lines, provider-scoped)
Replace the log-mtime trigger for opencode_cli with a tmux capture-pane + provider.get_status() polling loop running at INBOX_POLLING_INTERVAL. Scoped via ProviderType so other providers keep their current event-driven behaviour and don't pay polling cost.
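A rough sketch of what such a polling loop might look like. Every name here (poll_inbox, get_status, deliver_pending, the "idle" status value) is hypothetical: get_status stands in for a provider.get_status() call backed by tmux capture-pane, and deliver_pending stands in for check_and_send_pending_messages. The real signatures and status values may differ.

```python
import threading
from typing import Callable


def poll_inbox(
    get_status: Callable[[], str],
    deliver_pending: Callable[[], None],
    stop: threading.Event,
    interval: float,
    idle_status: str = "idle",  # assumed status value
) -> int:
    """Poll the provider status and deliver pending messages whenever it is idle.

    Returns the number of delivery attempts made before stop was set.
    """
    deliveries = 0
    while not stop.is_set():
        # No dependence on log mtime: the status is queried directly each tick.
        if get_status() == idle_status:
            deliver_pending()
            deliveries += 1
        # Event.wait doubles as an interruptible sleep.
        stop.wait(interval)
    return deliveries


def demo() -> tuple[int, list[str]]:
    # Fake provider: busy for two ticks, then idle.
    statuses = iter(["processing", "processing", "idle"])
    stop = threading.Event()
    delivered: list[str] = []

    def deliver() -> None:
        delivered.append("msg")
        stop.set()  # end the demo after the first delivery

    count = poll_inbox(lambda: next(statuses), deliver, stop, interval=0.01)
    return count, delivered


print(demo())
```

In the service, a loop like this would presumably run on its own thread, be started only for terminals whose ProviderType is opencode_cli, and use INBOX_POLLING_INTERVAL as the interval.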
Related