Skip to content

Commit 14734c7

Browse files
committed
feat: eager inbox delivery for providers that buffer input during processing
Add opt-in eager delivery mode (CAO_EAGER_INBOX_DELIVERY env var) that allows inbox messages to be delivered to terminals in PROCESSING or WAITING_USER_ANSWER states, eliminating the latency gap between agent turns for capable providers. Changes: - Add EAGER_INBOX_DELIVERY constant gated behind CAO_EAGER_INBOX_DELIVERY - Add accepts_input_while_processing property to BaseProvider (default False) - Override in ClaudeCodeProvider (returns True only after initialization) - Relax status gate in check_and_send_pending_messages() with two-flag check (env var + provider capability) - Skip idle-pattern pre-check in watchdog for eager-capable providers - Add native_agent thin-wrapper routing: missing CAO profiles fall back to --agent <name> for Claude Code's native agent store - Add 9 unit tests for eager delivery + native agent fallback test Misc Claude Code provider fixes: - Preserve CLAUDE_CODE_EFFORT_LEVEL through CAO's env-unset command - Fix response extraction false stops on ">" in content (Java generics, git diffs, HTML) by using start-of-line anchor (_SOL_IDLE_RE)
1 parent 1d8d710 commit 14734c7

11 files changed

Lines changed: 438 additions & 17 deletions

File tree

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -223,6 +223,7 @@ Example: a supervisor assigns parallel data-analysis tasks to multiple analysts
223223
- Sends a message to a specific terminal's inbox; delivered when the terminal is idle
224224
- Enables ongoing collaboration and multi-turn conversations
225225
- Common in **swarm** operations
226+
- Supports [eager delivery](docs/inbox-delivery.md) for providers that buffer input during processing (eliminates inter-turn latency)
226227

227228
Example: multi-role feature development.
228229

docs/agent-profile.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ Define the agent's role, responsibilities, and behavior here.
3333
- `toolsSettings` (object): Tool-specific configuration
3434
- `model` (string): AI model to use
3535
- `permissionMode` (string, `claude_code` only): One of `"default"`, `"acceptEdits"`, `"plan"`, `"auto"`, `"bypassPermissions"`. When set, the `claude_code` provider passes `--permission-mode <value>` instead of `--dangerously-skip-permissions`. `cao launch --yolo` overrides this and forces bypass. See [Claude Code permission modes](https://code.claude.com/docs/en/permission-modes).
36+
- `native_agent` (string, `claude_code` only): Name of a native Claude Code agent (`~/.claude/agents/`). When set, the provider passes `--agent <name>` directly and skips system prompt / MCP config decomposition (thin-wrapper mode). See [Claude Code native agent routing](claude-code.md#native-agent-routing).
3637
- `prompt` (string): Additional prompt text
3738

3839
## Tool Restrictions

docs/claude-code.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -113,6 +113,27 @@ permissionMode: auto
113113
You review code for quality and correctness.
114114
```
115115

116+
## Eager Inbox Delivery
117+
118+
Claude Code's Ink TUI buffers pasted input even while the agent is processing. CAO exploits this to deliver queued inbox messages during PROCESSING and WAITING_USER_ANSWER states, eliminating inter-turn latency. Enable with `CAO_EAGER_INBOX_DELIVERY=true`.
119+
120+
See [Inbox Delivery](inbox-delivery.md) for the full architecture, two-flag gate, and how to enable this for other providers.
121+
122+
## Native Agent Routing
123+
124+
When a CAO profile specifies a `native_agent` field, the provider passes `--agent <name>` directly to Claude Code's native agent store (`~/.claude/agents/`). This is a thin-wrapper mode where Claude Code handles all configuration (MCP servers, hooks, tools, model).
125+
126+
If no CAO profile is found for the given agent name, the provider also falls back to `--agent <name>`, assuming it exists in the native store.
127+
128+
```markdown
129+
---
130+
name: my-wrapper
131+
description: Thin wrapper for a native Claude Code agent
132+
provider: claude_code
133+
native_agent: my-native-agent
134+
---
135+
```
136+
116137
## Implementation Notes
117138

118139
- **Prompt patterns**: `IDLE_PROMPT_PATTERN` matches both old `>` and new `` prompt styles, including non-breaking space (`\xa0`)

docs/inbox-delivery.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# Inbox Delivery
2+
3+
## Overview
4+
5+
When an agent calls `send_message(terminal_id, message)`, the message is queued in the database and delivered to the target terminal's input area via bracketed paste. Delivery has two paths:
6+
7+
1. **Immediate**: the API endpoint attempts delivery right after persisting the message
8+
2. **Watchdog**: a `PollingObserver` (5s interval) monitors terminal log files for changes and attempts delivery when idle patterns are detected
9+
10+
Both paths converge on `check_and_send_pending_messages()`, which gates delivery based on terminal status.
11+
12+
## Standard Delivery
13+
14+
By default, messages are only delivered when the terminal status is **IDLE** or **COMPLETED**. This ensures the provider's TUI is ready to accept input and the message won't be lost or corrupt the terminal state.
15+
16+
## Eager Delivery
17+
18+
Some providers (e.g., Claude Code) have TUIs that buffer pasted input even while processing. For these providers, waiting for IDLE introduces unnecessary latency between agent turns.
19+
20+
Eager delivery allows messages to be delivered during **PROCESSING** and **WAITING_USER_ANSWER** states, eliminating the inter-turn gap.
21+
22+
### Enabling
23+
24+
Set the environment variable before starting the CAO server:
25+
26+
```bash
27+
export CAO_EAGER_INBOX_DELIVERY=true
28+
cao-server
29+
```
30+
31+
When disabled (default), delivery behavior is unchanged -- messages wait for IDLE or COMPLETED.
32+
33+
### Two-Flag Gate
34+
35+
Eager delivery requires both conditions to be true:
36+
37+
1. **Environment variable** (`CAO_EAGER_INBOX_DELIVERY=true`): global kill-switch for operators
38+
2. **Provider capability** (`accepts_input_while_processing = True`): per-provider opt-in
39+
40+
This prevents accidental delivery to providers whose TUIs would be corrupted by unsolicited input during processing.
41+
42+
### How the Watchdog Path Changes
43+
44+
Without eager delivery, the watchdog uses a fast `_has_idle_pattern()` check before attempting delivery. For eager-capable providers, this check is skipped (there is no idle pattern during PROCESSING), and the watchdog proceeds directly to `check_and_send_pending_messages()` where the full status gate applies.
45+
46+
### Provider Capability: `accepts_input_while_processing`
47+
48+
A property on `BaseProvider` (default `False`) that signals whether a provider's TUI safely buffers pasted input during processing. Override to `True` in providers that support this.
49+
50+
Currently enabled for:
51+
- **Claude Code** (`ClaudeCodeProvider`): Ink TUI buffers input at all times
52+
53+
Other providers that may support this (contributions welcome):
54+
- **Codex**: TUI-based, may buffer input
55+
- **OpenCode**: TUI-based, may buffer input
56+
57+
To enable for a new provider, override the property:
58+
59+
```python
60+
@property
61+
def accepts_input_while_processing(self) -> bool:
62+
"""This provider buffers pasted input during processing."""
63+
return self._initialized
64+
```
65+
66+
The `_initialized` gate is important -- it prevents delivery during startup when `get_status()` returns PROCESSING but the REPL isn't actually ready.
67+
68+
### Risks
69+
70+
| Risk | Likelihood | Mitigation |
71+
|------|-----------|------------|
72+
| Message delivered during PROCESSING gets lost (agent errors mid-turn) | Low | Message status is DELIVERED; acceptable for v1 |
73+
| Watchdog fires every 5s during long turns | Medium (bounded) | One DB query + one tmux call per interval; no amplification |
74+
| Feature causes regression in non-eager providers | None | Provider flag defaults to False; only opt-in providers affected |

src/cli_agent_orchestrator/constants.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,12 @@
6060
# Lower values = faster response, higher CPU usage
6161
INBOX_POLLING_INTERVAL = 5
6262

63+
# Eager inbox delivery: when enabled, deliver queued messages to terminals in
64+
# PROCESSING or WAITING_USER_ANSWER state for providers that declare
65+
# accepts_input_while_processing=True. Eliminates latency between agent turns
66+
# for capable providers (e.g., Claude Code).
67+
EAGER_INBOX_DELIVERY = os.environ.get("CAO_EAGER_INBOX_DELIVERY", "false").lower() == "true"
68+
6369
# =============================================================================
6470
# Cleanup Service Configuration
6571
# =============================================================================

src/cli_agent_orchestrator/models/agent_profile.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,3 +38,4 @@ class AgentProfile(BaseModel):
3838
useLegacyMcpJson: Optional[bool] = None
3939
model: Optional[str] = None
4040
permissionMode: Optional[PermissionMode] = None
41+
native_agent: Optional[str] = None # Claude Code native agent name (thin-wrapper mode)

src/cli_agent_orchestrator/providers/base.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -131,6 +131,19 @@ def get_idle_pattern_for_log(self) -> str:
131131
"""
132132
pass
133133

134+
@property
135+
def accepts_input_while_processing(self) -> bool:
136+
"""Whether this provider buffers pasted input during PROCESSING for next-turn pickup.
137+
138+
When True AND CAO_EAGER_INBOX_DELIVERY is enabled, the inbox service will
139+
deliver messages to this terminal even when its status is PROCESSING or
140+
WAITING_USER_ANSWER, rather than waiting for IDLE/COMPLETED.
141+
142+
Override in subclasses for providers whose TUI buffers input at all times
143+
(e.g., Claude Code's Ink renderer).
144+
"""
145+
return False
146+
134147
@property
135148
def extraction_retries(self) -> int:
136149
"""Number of extraction retries for transient TUI rendering issues.

src/cli_agent_orchestrator/providers/claude_code.py

Lines changed: 51 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -74,6 +74,13 @@ def _build_claude_command(self) -> str:
7474
7575
Returns properly escaped shell command string that can be safely sent via tmux.
7676
Uses shlex.join() to handle multiline strings and special characters correctly.
77+
78+
Three routing paths based on agent profile state:
79+
1. Profile with native_agent field -> pass --agent <native_agent> directly
80+
(thin wrapper: Claude Code handles all config)
81+
2. No CAO profile found -> pass --agent <name> directly to Claude Code's
82+
native agent store (~/.claude/agents/)
83+
3. Full CAO profile -> decompose into CLI flags (model, prompt, MCP, etc.)
7784
"""
7885
# --dangerously-skip-permissions: bypass the workspace trust dialog and
7986
# tool permission prompts. CAO already confirms workspace access during
@@ -85,24 +92,38 @@ def _build_claude_command(self) -> str:
8592
if self._agent_profile is not None:
8693
try:
8794
profile = load_agent_profile(self._agent_profile)
95+
except FileNotFoundError:
96+
profile = None
8897
except Exception as e:
8998
raise ProviderError(f"Failed to load agent profile '{self._agent_profile}': {e}")
9099

100+
# Determine permission mode for the base command
91101
if profile and profile.permissionMode and not yolo:
92102
command_parts = ["claude", "--permission-mode", profile.permissionMode]
93103
else:
94104
command_parts = ["claude", "--dangerously-skip-permissions"]
95105

96-
if profile is not None:
106+
# Route based on profile state
107+
native = getattr(profile, "native_agent", None) if profile else None
108+
if profile is not None and isinstance(native, str) and native:
109+
# Thin wrapper: CAO profile maps to a native Claude Code agent.
110+
# Let Claude Code handle all config (MCP servers, hooks, tools, model).
111+
# CAO_TERMINAL_ID propagates via tmux pane env inheritance.
112+
command_parts.extend(["--agent", native])
113+
elif self._agent_profile is not None and profile is None:
114+
# No CAO profile exists — pass agent name directly to Claude Code's
115+
# native agent store (~/.claude/agents/). Same thin-orchestrator
116+
# pattern as the Kiro CLI provider.
117+
command_parts.extend(["--agent", self._agent_profile])
118+
elif profile is not None:
119+
# Full CAO profile with config decomposition
97120
if profile.model:
98121
command_parts.extend(["--model", profile.model])
99122

100123
# Add system prompt - escape newlines to prevent tmux chunking issues
101124
system_prompt = profile.system_prompt if profile.system_prompt is not None else ""
102125
system_prompt = self._apply_skill_prompt(system_prompt)
103126
if system_prompt:
104-
# Replace actual newlines with \n escape sequences
105-
# This prevents tmux send_keys chunking from breaking the command
106127
escaped_prompt = system_prompt.replace("\\", "\\\\").replace("\n", "\\n")
107128
command_parts.extend(["--append-system-prompt", escaped_prompt])
108129

@@ -144,13 +165,14 @@ def _build_claude_command(self) -> str:
144165
# When cao-server runs inside a Claude Code session, CLAUDE* env vars
145166
# leak into spawned tmux panes (via the tmux server's global env).
146167
# Claude Code detects these and refuses to start ("nested session").
147-
# Unset all matching vars except CLAUDE_CODE_USE_* and
168+
# Unset all matching vars except CLAUDE_CODE_USE_*,
148169
# CLAUDE_CODE_SKIP_*_AUTH (needed for provider authentication:
149-
# Bedrock, Vertex AI, Foundry).
170+
# Bedrock, Vertex AI, Foundry), and CLAUDE_CODE_EFFORT_LEVEL (user pref).
150171
unset_cmd = (
151172
"unset $(env | sed -n 's/^\\(CLAUDE[A-Z_]*\\)=.*/\\1/p'"
152173
" | grep -v -E 'CLAUDE_CODE_USE_(BEDROCK|VERTEX|FOUNDRY)"
153-
"|CLAUDE_CODE_SKIP_(BEDROCK|VERTEX|FOUNDRY)_AUTH'"
174+
"|CLAUDE_CODE_SKIP_(BEDROCK|VERTEX|FOUNDRY)_AUTH"
175+
"|CLAUDE_CODE_EFFORT_LEVEL'"
154176
") 2>/dev/null"
155177
)
156178
return f"{unset_cmd}; {claude_cmd}"
@@ -390,10 +412,24 @@ def get_status(self, tail_lines: Optional[int] = None) -> TerminalStatus:
390412

391413
return TerminalStatus.ERROR
392414

415+
@property
416+
def accepts_input_while_processing(self) -> bool:
417+
"""Claude Code's Ink TUI buffers pasted input during processing.
418+
419+
Only true after initialization completes — during startup the REPL
420+
isn't ready to accept input even though get_status() sees PROCESSING.
421+
"""
422+
return self._initialized
423+
393424
def get_idle_pattern_for_log(self) -> str:
394425
"""Return Claude Code IDLE prompt pattern for log files."""
395426
return IDLE_PROMPT_PATTERN_LOG
396427

428+
# Start-of-line idle prompt for extraction: ❯ or > at the beginning of a line
429+
# (after optional ANSI codes). Mid-line ">" in Java generics, git diffs, HTML
430+
# etc. must NOT trigger the stop condition.
431+
_SOL_IDLE_RE = re.compile(r"^\s*(?:\x1b\[[0-9;]*m)*[>❯](?:\x1b\[[0-9;]*m)*[\s\xa0]")
432+
397433
def extract_last_message_from_script(self, script_output: str) -> str:
398434
"""Extract Claude's final response message using ⏺ indicator."""
399435
# Find all matches of response pattern
@@ -406,20 +442,24 @@ def extract_last_message_from_script(self, script_output: str) -> str:
406442
last_match = matches[-1]
407443
start_pos = last_match.end()
408444

409-
# Extract everything after the last ⏺ until next prompt or separator
445+
# Extract everything after the last ⏺ until:
446+
# 1. A start-of-line idle prompt (❯ or >) — the definitive boundary
447+
# 2. A completion stat line ("✻ Sautéed for 14s") — trims the stat
448+
# Using start-of-line anchor avoids false stops on ">" inside
449+
# response content (Java generics, git diffs, HTML tags, etc.).
410450
remaining_text = script_output[start_pos:]
411451

412452
# Split by lines and extract response
413453
lines = remaining_text.split("\n")
414454
response_lines = []
415455

416456
for line in lines:
417-
# Stop at next > prompt or separator line
418-
if re.match(r">\s", line) or "────────" in line:
457+
clean_line = re.sub(ANSI_CODE_PATTERN, "", line).strip()
458+
if self._SOL_IDLE_RE.match(line):
459+
break
460+
if "────────" in line:
419461
break
420462

421-
# Clean the line
422-
clean_line = line.strip()
423463
response_lines.append(clean_line)
424464

425465
if not response_lines or not any(line.strip() for line in response_lines):

src/cli_agent_orchestrator/services/inbox_service.py

Lines changed: 20 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@
3333
list_pending_receiver_ids_by_provider,
3434
update_message_status,
3535
)
36-
from cli_agent_orchestrator.constants import TERMINAL_LOG_DIR
36+
from cli_agent_orchestrator.constants import EAGER_INBOX_DELIVERY, TERMINAL_LOG_DIR
3737
from cli_agent_orchestrator.models.inbox import MessageStatus, OrchestrationType
3838
from cli_agent_orchestrator.models.provider import ProviderType
3939
from cli_agent_orchestrator.models.terminal import TerminalStatus
@@ -109,7 +109,13 @@ def check_and_send_pending_messages(
109109
# the idle prompt was never found, so messages stayed PENDING forever.
110110
status = provider.get_status()
111111

112-
if status not in (TerminalStatus.IDLE, TerminalStatus.COMPLETED):
112+
eager_eligible = (
113+
EAGER_INBOX_DELIVERY
114+
and provider.accepts_input_while_processing
115+
and status in (TerminalStatus.PROCESSING, TerminalStatus.WAITING_USER_ANSWER)
116+
)
117+
118+
if status not in (TerminalStatus.IDLE, TerminalStatus.COMPLETED) and not eager_eligible:
113119
logger.debug(f"Terminal {terminal_id} not ready (status={status})")
114120
return False
115121

@@ -182,7 +188,18 @@ def _handle_log_change(self, terminal_id: str):
182188
return
183189

184190
# Fast check: does log tail have idle pattern?
185-
if not _has_idle_pattern(terminal_id):
191+
# Skip for eager-delivery-capable providers — they have no idle pattern
192+
# during PROCESSING but can still accept input.
193+
skip_idle_check = False
194+
if EAGER_INBOX_DELIVERY:
195+
try:
196+
provider = provider_manager.get_provider(terminal_id)
197+
if provider and provider.accepts_input_while_processing:
198+
skip_idle_check = True
199+
except Exception as e:
200+
logger.debug(f"Eager delivery check failed for {terminal_id}: {e}")
201+
202+
if not skip_idle_check and not _has_idle_pattern(terminal_id):
186203
logger.debug(
187204
f"Terminal {terminal_id} not idle (no idle pattern in log tail), skipping"
188205
)

0 commit comments

Comments
 (0)