fix(opencode): post-settle inbox messages stuck in PENDING state #203

@patricka3125

Summary

When the opencode_cli supervisor's turn ends, pending inbox messages are never delivered. Multi-agent flows (assign → worker reply → supervisor) silently deadlock. Single-agent and pure handoff workflows are unaffected.

Tracked because docs/opencode-cli.md and the README provider table both flag opencode as experimental: single-agent flows only until this is fixed.

RCA (verified against code in PR #193)

Two compounding causes in services/inbox_service.py:

  1. mtime doesn't advance post-settle. LogFileHandler is scheduled under watchdog.observers.polling.PollingObserver(timeout=INBOX_POLLING_INTERVAL), with INBOX_POLLING_INTERVAL = 5, which scans TERMINAL_LOG_DIR every 5 s and emits on_modified only when a file's mtime has changed since the previous scan. OpenCode's alt-screen TUI emits no pty bytes once the turn is idle → tmux pipe-pane writes nothing → log mtime freezes → handler never fires. The DEBUG log shows four "Log file modified" events at 5 s intervals during the turn, then a 62 s gap with no events, then a single event immediately after a manual printf ' ' >> <log>.

  2. Idle pattern absent from the pipe-pane byte stream. Even when the handler does fire, _has_idle_pattern(tail) (inbox_service.py:59) looks for ctrl+p\s+commands in the log tail and returns False, because that footer never reaches the log. The opencode TUI emits the footer string once during the initial render, before pipe_pane() is attached (pipe_pane() runs after provider.initialize() returns), and thereafter only emits incremental cursor/character updates.
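The first cause can be reproduced in isolation. The following is a minimal sketch of mtime-snapshot polling, not watchdog's actual implementation: `snapshot` and `modified_since` are hypothetical helpers, and the log file name is made up. It shows that a scan of an untouched log reports nothing, while a one-byte append (the manual printf above) produces a modification.

```python
import os
import tempfile


def snapshot(directory):
    """Map each regular file name in `directory` to its mtime in nanoseconds."""
    with os.scandir(directory) as entries:
        return {e.name: e.stat().st_mtime_ns for e in entries if e.is_file()}


def modified_since(prev, curr):
    """Names present in both snapshots whose mtime differs (hypothetical helper)."""
    return sorted(
        name for name, mtime in curr.items()
        if name in prev and prev[name] != mtime
    )


with tempfile.TemporaryDirectory() as log_dir:
    log = os.path.join(log_dir, "terminal.log")  # made-up file name

    # Output written during the turn; pin the mtime so the demo is deterministic.
    with open(log, "w") as f:
        f.write("turn output\n")
    os.utime(log, ns=(1_000_000_000, 1_000_000_000))
    during_turn = snapshot(log_dir)

    # Turn settles: nothing is written, so the next scan sees no change.
    idle_scan = modified_since(during_turn, snapshot(log_dir))

    # Manual nudge, equivalent to: printf ' ' >> <log>
    with open(log, "a") as f:
        f.write(" ")
    os.utime(log, ns=(2_000_000_000, 2_000_000_000))
    nudged_scan = modified_since(during_turn, snapshot(log_dir))

print(idle_scan)    # no files reported while the log is frozen
print(nudged_scan)  # the log is reported after the append
```

Any trigger that relies on this kind of comparison stays silent for as long as the log file is not written to, regardless of how often the directory is scanned.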

Either cause alone blocks delivery; together they make the failure deterministic. check_and_send_pending_messages itself (inbox_service.py:105) uses provider.get_status() directly and is fine; the breakage is purely in the watchdog wake-up path. (Credit: @haofeif for that nuance.)
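To make the "either cause alone" point concrete, here is a small sketch of the wake-up gate as described above. It is an illustration under assumptions, not the code in inbox_service.py: `watchdog_path_delivers` is a hypothetical name, the sample strings are invented, and the sketch escapes the `+` so the footer is matched literally (the real pattern may be written differently).

```python
import re

# Assumption: "+" is escaped so that the literal footer text "ctrl+p commands"
# matches. The actual pattern in inbox_service.py may differ.
IDLE_PATTERN = re.compile(r"ctrl\+p\s+commands")


def has_idle_pattern(tail: str) -> bool:
    """True if the idle footer appears anywhere in the log tail."""
    return IDLE_PATTERN.search(tail) is not None


def watchdog_path_delivers(mtime_changed: bool, tail: str) -> bool:
    """Hypothetical model of the wake-up path: the handler must fire
    (mtime changed) AND the tail must contain the idle footer."""
    return mtime_changed and has_idle_pattern(tail)


# Invented samples: a full footer render vs. an incremental update
# (cursor move, erase-to-end-of-line, one character).
footer_render = "tab agents  ctrl+p  commands"
incremental_update = "\x1b[12;40H\x1b[Kx"

for fired in (False, True):
    for label, tail in (("footer", footer_render), ("incremental", incremental_update)):
        print(fired, label, watchdog_path_delivers(fired, tail))
```

Only the combination "handler fired" plus "footer in tail" passes the gate. Post-settle, the first condition fails because the log is frozen, and the second fails because the footer was rendered before pipe_pane() was attached.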

Reproduction

Run examples/assign/ against opencode_cli. Workers reply via send_message; messages land in DB as pending and are never delivered to the supervisor.

Proposed interim fix (~20 lines, provider-scoped)

Replace the log-mtime trigger for opencode_cli with a tmux capture-pane + provider.get_status() polling loop running at INBOX_POLLING_INTERVAL. Scoped via ProviderType so other providers keep their current event-driven behaviour and don't pay polling cost.
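A rough sketch of what such a loop could look like follows. Everything here is an assumption for illustration: `poll_inbox`, `capture_pane`, the injected callables, and the "idle" status value are hypothetical and not taken from the codebase. The only external command used is `tmux capture-pane -p -t <target>`, which prints a pane's contents to stdout.

```python
import subprocess
import threading
from typing import Callable

INBOX_POLLING_INTERVAL = 5  # seconds, matching the value quoted in the RCA


def capture_pane(target: str) -> str:
    """Return the visible contents of a tmux pane.

    Requires a running tmux server; a provider's status check could read
    the screen this way instead of relying on the pipe-pane log.
    """
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def poll_inbox(
    get_status: Callable[[], str],
    deliver_pending: Callable[[], None],
    stop: threading.Event,
    interval: float = INBOX_POLLING_INTERVAL,
    idle_status: str = "idle",  # hypothetical status value
) -> int:
    """Poll the provider status and deliver pending messages whenever it
    reports idle. Returns the number of delivery attempts made."""
    deliveries = 0
    while not stop.is_set():
        if get_status() == idle_status:
            deliver_pending()
            deliveries += 1
        # Event.wait doubles as an interruptible sleep.
        stop.wait(interval)
    return deliveries
```

In a real integration the loop would presumably run on a daemon thread started only when the terminal's provider is opencode_cli, with `get_status` bound to provider.get_status() and `deliver_pending` bound to the existing delivery function, so other providers stay on the event-driven path.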

Related
