Add comprehensive test suite for Q CLI provider #16
Merged
Conversation
tuanknguyen
suggested changes
Oct 31, 2025
tuanknguyen
left a comment
Contributor
Great addition overall! The main change I request is to follow the uv convention throughout to stay consistent with the repo. Another ask is to add the example GitHub Actions workflow you had as an actual yml file in .github/workflows (with a few small modifications).
Addresses all review feedback from PR awslabs#16
haofeif
added a commit
that referenced
this pull request
Feb 9, 2026
…t stabilization
Three bugs fixed in Gemini CLI provider (lessons #16, #17, #18):
1. Ink TUI keeps idle prompt visible during processing: added
PROCESSING_SPINNER_PATTERN (Braille dots + "esc to cancel") check in
get_status() to avoid premature COMPLETED when MCP tools are still executing.
2. Premature COMPLETED between text output and MCP tool call: replaced
sequential wait_for_status + terminal count check with combined
_wait_for_supervisor_done() polling in supervisor E2E tests.
3. Ink TUI shows idle prompt before -i prompt is processed: added
_uses_prompt_interactive flag so initialize() waits for COMPLETED (not IDLE)
when -i is used, preventing lost messages.
E2E test improvements across all providers:
- Accept both "idle" and "completed" as valid post-initialization states
- Add stabilization delay after COMPLETED detection in handoff/assign tests
- Add missing get_terminal_status import in test_handoff.py
E2E results: 13/14 pass (7/7 Gemini, 6/7 Codex; 1 pre-existing Codex
supervisor_assign output extraction failure).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
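The first fix above can be illustrated with a minimal sketch. It assumes a simplified status function and a hypothetical idle-prompt marker (`IDLE_PROMPT_MARKER`); the regex is a guess at what "Braille dots + esc to cancel" could look like, not the provider's actual pattern.

```python
import re

# Assumption: spinner frames are Braille characters (U+2800-U+28FF) on the
# same line as the "esc to cancel" hint. The real pattern may differ.
PROCESSING_SPINNER_PATTERN = re.compile(r"[\u2800-\u28FF].*esc to cancel")

# Hypothetical stand-in for the idle prompt that the Ink TUI keeps on screen.
IDLE_PROMPT_MARKER = "Type your message"


def get_status(output: str) -> str:
    """Classify a terminal capture, checking the spinner before the prompt."""
    # The spinner check runs first because the idle prompt stays visible
    # while tools are still executing.
    if PROCESSING_SPINNER_PATTERN.search(output):
        return "processing"
    if IDLE_PROMPT_MARKER in output:
        return "completed"
    return "unknown"
```

Checking the spinner before the prompt is the design point: with the order reversed, a capture that shows both would be reported as completed while work is still running.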
haofeif
added a commit
that referenced
this pull request
Apr 24, 2026
* feat(opencode): Phase 1 — foundation primitives for OpenCode CLI provider
- Add ProviderType.OPENCODE_CLI = "opencode_cli" to the provider enum
- Add OPENCODE_CONFIG_DIR / OPENCODE_AGENTS_DIR / OPENCODE_CONFIG_FILE path
constants pointing at ~/.aws/opencode_cli/
- New OpenCodeAgentConfig Pydantic model (description, mode, permission) that
serializes to OpenCode-compatible YAML frontmatter via frontmatter.dumps()
- New cao_tools_to_opencode_permission() translator: two-step algorithm from §9
of the design doc (shorthand expansion + CAO-category → OpenCode tool mapping +
hardcoded non-vocabulary deny/allow policies)
- New opencode_config.py read-modify-write helper for the shared opencode.json
(upsert_mcp_server, upsert_agent_tools, remove_agent_tools, read_config,
write_config)
- Port 5 TUI probe captures into test/providers/fixtures/ (plain + ANSI variants
for idle-splash, idle-post-completion, processing, completed, permission states)
- 54 new unit tests covering all Phase 1 modules; all 1368 tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(opencode): Phase 1 review items — ANSI fixture, model cleanup, guard-rail
Item 1: Replace opencode_cli_processing.ansi.txt with a genuine PROCESSING frame
re-captured via tmux probe (md5 9cbe2723, distinct from completed frame).
Add test/providers/fixtures/OPENCODE_FIXTURES.md documenting all fixture
sources and the remaining idle_post_completion.ansi.txt reuse.
Item 2: Remove dead Pydantic v1 `class Config: exclude_none = True` block from
OpenCodeAgentConfig — it is a no-op under Pydantic v2.
Item 3: Add inline comment to OpenCodeAgentConfig.permission documenting the
deliberate Phase 1 type simplification and when to widen it.
Item 4: Replace unreachable `else: result[tool] = "deny"` in
opencode_permissions.py with `raise AssertionError(...)` so any future
tool added to ALL_OPENCODE_TOOLS without a policy update fails loudly.
Item 5: Add test_noop_on_completely_missing_file to TestRemoveAgentTools —
exercises the read_config() skeleton-return path when opencode.json
does not exist yet.
All 1369 tests pass; mypy/black/isort clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
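The guard-rail from Item 4 can be sketched roughly as follows. The tool names, the `DEFAULT_POLICY` table, and the `build_permission` helper are hypothetical; only the idea of raising `AssertionError` for a tool without a policy comes from the commit message.

```python
# Hypothetical tool vocabulary; the real ALL_OPENCODE_TOOLS is not shown here.
ALL_OPENCODE_TOOLS = ("bash", "read", "edit")

# Hypothetical per-tool fallback policy used when a tool is not granted.
DEFAULT_POLICY = {"bash": "deny", "read": "deny", "edit": "deny"}


def build_permission(granted, all_tools=ALL_OPENCODE_TOOLS):
    """Map every known tool to "allow" or its fallback policy."""
    result = {}
    for tool in all_tools:
        if tool not in DEFAULT_POLICY:
            # Fail loudly instead of silently denying: a tool was added to
            # the vocabulary without a matching policy entry.
            raise AssertionError(f"no permission policy for tool {tool!r}")
        result[tool] = "allow" if tool in granted else DEFAULT_POLICY[tool]
    return result
```

A silent `deny` default would hide the missing policy; the assertion turns it into a test failure the first time the vocabulary and the policy table drift apart.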
* feat(install): add opencode_cli provider branch with --auto-approve flag
Adds the `opencode_cli` branch to `cao install` (Phase 2):
- writes agent `<name>.md` with YAML frontmatter (description, mode, permission)
using compose_agent_prompt for the body and cao_tools_to_opencode_permission
for the per-tool allow/ask/deny map
- `--auto-approve` flag emits `allow` instead of `ask` for permitted tools;
has no effect on other providers
- if the agent profile declares mcpServers, upserts top-level mcp/tools entries
(default-deny) and per-agent tool re-enables into opencode.json
- full unit-test coverage in test/cli/commands/test_install_opencode.py
(fresh install, idempotency, auto-approve, MCP wiring, config preservation,
safe filename)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(install): apply Phase 2 review polish
- Rename agent_config_oc → agent_config in opencode_cli branch for
consistency with Kiro/Q/Copilot sibling branches (Item 1)
- Strengthen test_agent_md_has_body: assert sentinel prompt text via
profile.prompt frontmatter field instead of weak non-empty check (Item 2)
- Bump live smoke-test subprocess timeout 30s → 60s to survive cold-cache
npm plugin installs on CI (Item 4)
Items 3 (MCP collision coverage already in Phase 1) and 5 (context-file
parent mkdir — out of Phase 2 scope) intentionally not addressed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(providers): add OpenCodeCliProvider (Phase 3)
Implements the opencode_cli runtime provider per §8 of the design doc:
- OpenCodeCliProvider with full BaseProvider interface (initialize,
get_status, extract_last_message_from_script, exit_cli, cleanup)
- 5-state detection (IDLE/PROCESSING/COMPLETED/WAITING_USER_ANSWER/ERROR)
with line-level position guard against stale alt-screen esc-interrupt
remnants (lesson #16)
- COMPLETED vs IDLE-post-completion distinguished by checking for a
subsequent ▣ token after the last full completion marker
- 120s initialize() timeout for first-run npm install cold-start (§8.2)
- Inline-env launch command with all stability env vars (§5)
- --model flag included only when profile.model is set (§3.1 exception)
- Registered in ProviderManager; "opencode_cli" added to
PROVIDERS_REQUIRING_WORKSPACE_ACCESS in launch.py
- 43 unit tests at 96% line coverage against Phase 1 fixtures
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: add Phase 3 OpenCode provider runtime development walkthrough report
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(opencode): Phase 3 review polish — correct report line count, add dual-pattern comment
Item 1: development report corrected 125 → 332 lines for opencode_cli.py.
Item 4: inline comment at extract_last_message_from_script explains why the
unanchored r"┃\s{2}" is used instead of the module-level USER_MESSAGE_PATTERN.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(opencode): Phase 4 — e2e test, provider docs, README/CHANGELOG
- test/e2e/conftest.py: add require_opencode fixture (skips if opencode not on PATH)
- test/e2e/test_assign.py: add TestOpenCodeCliAssign with data_analyst, report_generator,
and assign_with_callback tests covering all four orchestration modes
- docs/opencode-cli.md: new provider doc covering prerequisites, quick start, config
isolation, permission/tool mapping, MCP wiring, known limitations, troubleshooting
- README.md: add opencode_cli row to provider table + cao launch example
- CHANGELOG.md: add Unreleased entry announcing OpenCode CLI provider
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: add Phase 4 OpenCode e2e and docs development walkthrough report
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(opencode): translate CAO mcpServer format to OpenCode's opencode.json format (Phase 3 regression)
CAO profiles store MCP servers with {type: "stdio", command: str, args: list}.
OpenCode's opencode.json requires {type: "local", command: list, enabled: true}.
The install branch was passing raw CAO config directly, causing OpenCode to reject
the config with "Configuration is invalid: Invalid input mcp.cao-mcp-server".
Fix: add translate_mcp_server_config() to opencode_config.py and call it in the
opencode_cli install branch before upsert_mcp_server(). Also translates env→environment.
6 unit tests added for the translator.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
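The translation described in this commit could look roughly like the sketch below. The field names come from the commit message; building the command list by prepending `command` to `args` is an assumption, and the real `translate_mcp_server_config()` may handle more cases.

```python
def translate_mcp_server_config(cao_config: dict) -> dict:
    """Convert a CAO stdio MCP server entry to an OpenCode-style entry."""
    # Assumption: OpenCode's command list is the CAO command followed by args.
    command = [cao_config["command"], *cao_config.get("args", [])]
    translated = {"type": "local", "command": command, "enabled": True}
    # The commit also mentions the env -> environment rename.
    if "env" in cao_config:
        translated["environment"] = dict(cao_config["env"])
    return translated
```

For example, `{"type": "stdio", "command": "uvx", "args": ["cao-mcp-server"]}` would become `{"type": "local", "command": ["uvx", "cao-mcp-server"], "enabled": True}` under these assumptions.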
* docs(opencode): add --yolo DANGEROUS caveat to permission troubleshooting (Phase 4 review polish)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: update Phase 4 report with e2e results and Phase 3 regression fix notes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(opencode): native skill discovery via OPENCODE_CONFIG_DIR/skills symlink
At install time, create a skills → SKILLS_DIR symlink under OPENCODE_CONFIG_DIR
so OpenCode auto-discovers CAO skills through its native skill tool (§5.1). Uses
profile.system_prompt or profile.prompt as the lean agent body — the skill catalog
is no longer baked into the OpenCode system prompt.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(opencode): handle Nm Ns duration format and extend extraction buffer
Two status/extraction bugs revealed by e2e runs with full system prompts:
1. COMPLETION_MARKER_PATTERN now matches the Nm Ns duration format that
OpenCode emits for responses that take more than 60 seconds (e.g.
"1m 8s"). The old pattern only matched the pure-seconds form, causing
get_status() to stall at PROCESSING indefinitely for longer turns.
2. Add extraction_tail_lines property to BaseProvider (default None) and
override to 2000 in OpenCodeCliProvider. terminal_service.get_output
uses this value for the LAST-mode tmux capture so long responses don't
push the user-message marker (┃ ) beyond the 200-line default window.
Status-check captures are unaffected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
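A minimal sketch of the first fix, assuming the marker line has the shape `▣ agent · model · Ns` and that durations are whole numbers. Both regexes are illustrative reconstructions, not the provider's actual patterns.

```python
import re

# Illustrative "before" pattern: only the pure-seconds form, e.g. "8s".
SECONDS_ONLY_PATTERN = re.compile(r"▣\s+\S.*?·\s+\d+s\s*$", re.MULTILINE)

# Illustrative "after" pattern: an optional "Nm " prefix before the seconds.
COMPLETION_MARKER_PATTERN = re.compile(
    r"▣\s+\S.*?·\s+(?:\d+m\s+)?\d+s\s*$", re.MULTILINE
)


def is_completed(output: str) -> bool:
    """Return True when a completion marker line is present in the capture."""
    return COMPLETION_MARKER_PATTERN.search(output) is not None
```

With the seconds-only form, a line ending in `1m 8s` never matches, which is consistent with the stall at PROCESSING described above.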
* refactor(opencode): single capture-pane per get_output call; add wrong-target symlink test
Item 2: Eliminate double capture-pane in get_output(mode=LAST). Previously
the function always captured at 200 lines then recaptured if the provider
declared extraction_tail_lines. Now FULL mode returns after a single capture
at the default depth; LAST mode resolves extract_lines from the provider once
and makes exactly one capture before the retry loop.
Item 1: Add test_warns_and_skips_when_symlink_points_elsewhere to
TestEnsureSkillsSymlink, covering the branch at opencode_config.py:37-42
where the target is a symlink that resolves to a different directory.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(opencode): rename on-disk config directory from opencode_cli to opencode
Change OPENCODE_CONFIG_DIR from ~/.aws/opencode_cli to ~/.aws/opencode in
constants.py; OPENCODE_AGENTS_DIR and OPENCODE_CONFIG_FILE update transitively.
Update all path string references in docs, CHANGELOG, and the constants unit test.
Provider identifier (ProviderType.OPENCODE_CLI.value == "opencode_cli") is unchanged.
Add CHANGELOG migration note for users who need to re-run cao install.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(opencode): fall back to first agent-indented line when user message scrolled off viewport
OpenCode renders in alt-screen mode so the tmux scrollback only holds the current
visible frame (~41 lines, history_size≈2). For long responses the user-message bar
(┃ ) scrolls off the top before extraction runs, causing "No user message found".
When no ┃ is found before the completion marker, scan for the first 5-space-indented
agent line as the left boundary instead of raising. The visible frame already contains
only the current turn's content, so multi-turn disambiguation is not needed here.
Adds unit test test_fallback_extracts_when_user_message_scrolled_off.
e2e: 3/3 PASSED in 161s on port 9888.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
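The fallback can be sketched as below. This is a simplified illustration, not the provider's `extract_last_message_from_script`: the line-based scan, the `extract_response` name, and the exact 5-space rule are assumptions based on the commit text.

```python
import re

USER_BAR = re.compile(r"┃\s{2}")      # user-message bar, as quoted in the commits
COMPLETION_MARK = "▣"                 # completion marker glyph
AGENT_LINE = re.compile(r"^ {5}\S")   # assumption: agent text is indented 5 spaces


def extract_response(frame: str) -> str:
    """Return the text between the user message (or fallback) and the marker."""
    lines = frame.splitlines()
    marker_idx = max(
        (i for i, line in enumerate(lines) if COMPLETION_MARK in line), default=None
    )
    if marker_idx is None:
        raise ValueError("No completion marker found")
    user_idx = max(
        (i for i in range(marker_idx) if USER_BAR.search(lines[i])), default=None
    )
    if user_idx is not None:
        start = user_idx + 1
    else:
        # Fallback: the user bar scrolled off, so start at the first agent line.
        start = next(
            (i for i in range(marker_idx) if AGENT_LINE.match(lines[i])), None
        )
        if start is None:
            raise ValueError("No user message or agent line found")
    body = "\n".join(line.strip() for line in lines[start:marker_idx])
    return body.strip()
```

The fallback relies on the single-turn assumption stated in the commit: the visible frame holds only the current turn, so the first indented line is taken as the start of the response.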
* refactor(terminal_service): guard build_skill_catalog() call and update skill-delivery comments
Cleanup 1: build_skill_catalog() now runs only when the provider is in
RUNTIME_SKILL_PROMPT_PROVIDERS, skipping the file reads/YAML parsing/Pydantic
validation for providers that deliver skills natively (OpenCode symlink, Kiro
skill:// resources) or via install-time baking (Q, Copilot). The skill_prompt
kwarg at the create_provider call site simplifies to skill_prompt=skill_prompt
since the guard now lives one line above.
Cleanup 2: update comments in the RUNTIME_SKILL_PROMPT_PROVIDERS block and
create_terminal Steps 3b/4 to reflect Phase 5's native OpenCode skill discovery.
Adds two new tests asserting the lazy-call invariant:
- test_build_skill_catalog_called_for_runtime_prompt_provider (call_count == 1)
- test_build_skill_catalog_not_called_for_native_or_baked_provider (parametrized
over opencode_cli, kiro_cli, q_cli, copilot_cli; assert_not_called)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: update Phase 6 dev report with alt-screen extraction fix and cleanup commits
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(opencode): correct extraction_tail_lines docstring + black format + cleanup polish
Item 1: black reformats terminal_service.py line 155 from a three-line expression
to the single 100-char form black prefers.
Item 2: rewrite extraction_tail_lines docstring — the old text claimed responses
push ┃ beyond a 200-line window, which is wrong; OpenCode's alt-screen mode caps
history_size near 2 making the override a no-op. Docstring now accurately describes
the belt-and-braces rationale and cross-references the within-viewport fallback.
Item 3: add single-turn alt-screen assumption comment to the normal extraction path.
Item 4: CHANGELOG migration note gains a rm -rf cleanup hint for pre-release users.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(opencode): correct OPENCODE_DISABLE_MOUSE rationale and add UX callout
- Update design doc stability-env-vars entry: footer patterns (ctrl+p,
esc interrupt) are pinned and scroll-safe; the completion marker
(▣ agent · model · Ns) is conversation content and scrolls off,
preventing COMPLETED detection if mouse reporting is enabled
- Add 'Scrolling enters tmux copy mode' Known Limitations entry in
opencode-cli.md explaining the trade-off and how to work around it
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* untrack out of scope docs
* fix(opencode): align auto-approve, agent ID, and MCP cleanup semantics
Addresses PR 193 review feedback.
- Drop `cao install --auto-approve` and the `auto_approve` arg on the permission
translator. CAO owns the permission decision, so `cao_tools_to_opencode_permission`
now emits only `allow` or `deny`. `cao launch --auto-approve` retains its
repo-wide meaning (skip CAO's confirmation prompt) and no longer has a
provider-specific reinterpretation.
- Remove stale `agent.<id>.tools` entries when a profile is reinstalled without
`mcpServers` so revoked MCP grants do not persist.
- Introduce `to_opencode_agent_id()` and use it consistently for the installed
`.md` filename and the `agent.<id>.tools` key in `opencode.json`, keeping both
aligned with the runtime `opencode --agent` argument for slash-containing
profile names.
- Strip phase numbers and design-doc section references from shipped source
files and the provider doc per reviewer request.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(opencode): drop reference to removed design doc from changelog
Addresses review comment: the design doc link in the Unreleased entry
referred to a file that is not included in this PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(opencode): scope extraction_tail_lines to OpenCodeCliProvider
The property was opencode-specific but lived on BaseProvider, which meant
every provider carried an attribute it had no use for. Remove it from the
base class, keep it as a provider-local property on OpenCodeCliProvider,
and have terminal_service.get_output read it via a getattr capability
check so the base class stays agnostic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
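The capability check described here might look like the following sketch. The class bodies and the `resolve_tail_lines` helper are hypothetical; the 200-line default and the 2000-line override are the values mentioned in the earlier commits.

```python
DEFAULT_TAIL_LINES = 200  # default capture depth mentioned in the commits


class BaseProvider:
    """Base class stays agnostic: no extraction_tail_lines attribute."""


class OpenCodeCliProvider(BaseProvider):
    @property
    def extraction_tail_lines(self) -> int:
        # Provider-local override, as described in the commit message.
        return 2000


def resolve_tail_lines(provider: BaseProvider) -> int:
    """Use the provider's value if it declares one, else the default."""
    tail = getattr(provider, "extraction_tail_lines", None)
    return tail if tail is not None else DEFAULT_TAIL_LINES
```

Using `getattr` with a default keeps the optional attribute out of the base interface while letting one provider opt in.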
* test(opencode): close codecov gaps in cli provider and permission translator
Adds four tests to cover lines flagged as missing on the codecov report:
- get_status ERROR fallback for non-empty output with no recognized marker
- extract_last_message_from_script residual ``┃`` line + blank-line
branches when raw_response contains leftover bar-prefixed lines
- extract_last_message_from_script empty-response ValueError
- cao_tools_to_opencode_permission AssertionError when a tool appears in
ALL_OPENCODE_TOOLS without a matching policy
Brings opencode_cli.py and opencode_permissions.py to 100% patch coverage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(opencode): mark provider experimental — single-agent flows only
Add a warning badge to docs/opencode-cli.md and tag the README provider
table row, both linking the post-settle inbox-delivery deadlock tracked
in #203. Multi-agent flows are not yet reliable on opencode_cli; this
signals the constraint to evaluators before they hit it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: haofeif <56006724+haofeif@users.noreply.github.com>
haofeif
pushed a commit
that referenced
this pull request
Jun 16, 2026
* feat(providers): add Cursor CLI as a first-class provider
Adds support for the Cursor CLI (agent, https://cursor.com/cli) so it can be
orchestrated alongside Claude Code, Kiro CLI, Codex, and the other providers
already supported by CAO. Resolves issue #264.
The provider is built on the post-event-driven-architecture (post-#273)
provider API: async initialize(), get_status(output) that takes the
StatusMonitor buffer string directly, and get_backend() for tmux I/O.
What the provider does:
- Launches the interactive 'agent' REPL (the primary command per Cursor's
official docs; cursor-agent is the historical alias and resolves to the same
binary). --print is intentionally NOT used so the inbox service can stream
follow-up prompts via MCP handoff.
- Forwards the agent profile's system prompt via --system-prompt (newlines
escaped for tmux compatibility), with the skill catalog appended.
- Forwards profile.mcpServers via --mcp <json>, injecting CAO_TERMINAL_ID into
each server's env so MCP tools can identify the current terminal.
- Honors profile.model via --model (overridable via the constructor).
- Bypasses the per-tool approval dialog with --force, the workspace-trust
dialog with --trust, and the per-MCP-server approval dialog with
--approve-mcps, so worker agents spawned via handoff/assign do not block.
- Soft tool-restriction enforcement via SECURITY_PROMPT prepended to the
system prompt when allowedTools is set (Cursor CLI does not yet expose a
--disallowedTools equivalent).
Status detection (mirrors the Claude Code provider's robust pattern):
- Structural spinner-before-separator check for PROCESSING.
- Fallback position-based spinner check before the first separator.
- Idle / trust / permission prompts distinguished by pattern priority.
- Message extraction uses the structural separator + trailing prompt pattern,
since Cursor CLI does not emit a single canonical response marker like
Claude Code's \u23fa.
Files:
- src/cli_agent_orchestrator/providers/cursor_cli.py: new CursorCliProvider
(BaseProvider implementation, post-#273 API).
- src/cli_agent_orchestrator/providers/manager.py: register cursor_cli.
- src/cli_agent_orchestrator/models/provider.py: add CURSOR_CLI to enum.
- src/cli_agent_orchestrator/cli/commands/launch.py: add cursor_cli to
PROVIDERS_REQUIRING_WORKSPACE_ACCESS.
- test/providers/test_cursor_cli_unit.py: 53 unit tests (regex patterns,
get_status, extract, build command, async initialize, lifecycle, manager
registration, workspace access).
- test/providers/fixtures/cursor_cli_*.txt: 4 status fixtures (idle,
completed, processing, permission).
- docs/cursor-cli.md: full provider documentation.
- README.md: new row in the provider table; quickstart and cross-provider
sections updated to include cursor_cli.
Quality gate:
- black / isort: clean
- mypy on new/changed files: 0 errors
- pytest: 2372 passed, 0 failed (full test suite minus e2e/integration)
- pytest test/providers/test_cursor_cli_unit.py: 53 passed
- pytest test/providers/test_provider_manager_unit.py: 15 passed (no regression)
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
* test(cursor_cli): cover e2e examples/assign + dedupe extraction branch
Addresses review comments on PR #296.
Codecov (98.62%, 2 missing lines):
- Removed the unreachable 'Incomplete Cursor CLI response - no separator
before idle prompt' branch in extract_last_message_from_script. The early
check 'if not separators or not idle_matches' already rejects the only case
where this branch could fire (no separator at all before the trailing
prompt). Replaced with an explicit assert end_sep is not None plus a comment
explaining the invariant. Coverage is now 100%.
haofeif (verifies examples/assign e2e test passes with cursor_cli):
- Added require_cursor fixture in test/e2e/conftest.py (matches the
require_kimi / require_copilot pattern; checks both 'agent' and the legacy
'cursor-agent' alias).
- Added e2e test classes for every flow:
  * TestCursorCliAssign (3 tests: data_analyst, report_generator,
    assign_with_callback)
  * TestCursorCliHandoff (2 tests: simple_function, second_task)
  * TestCursorCliSendMessage (1 test)
  * TestCursorCliAllowedTools (3 tests; restricted_supervisor marked xfail
    because cursor_cli uses soft enforcement via SECURITY_PROMPT and has no
    native --disallowedTools equivalent)
  * TestCursorCliSupervisorOrchestration (3 tests including
    assign_three_analysts, the canonical examples/assign smoke test)
- Added cursor_cli to the launch examples in examples/assign/README.md so
users can run 'cao launch --agents analysis_supervisor --provider cursor_cli'.
New unit test:
- test_extracts_with_only_one_separator covers the start_sep=None fallback
path in extract_last_message_from_script (one-separator buffers where the
response-start separator has scrolled out of the 8KB rolling window).
Files: examples/assign/README.md,
src/cli_agent_orchestrator/providers/cursor_cli.py,
test/e2e/{conftest,test_assign,test_handoff,test_send_message,test_allowed_tools,test_supervisor_orchestration}.py,
test/providers/test_cursor_cli_unit.py
Quality gate:
- black / isort: clean (238 files)
- mypy on new/changed files: 0 errors
- pytest test/ --ignore=test/e2e -m 'not integration': 2373 passed, 0 failed
(was 2372 before this commit)
- pytest test/providers/test_cursor_cli_unit.py: 54 passed
- pytest test/e2e/ --collect-only -m e2e: 80 tests collected (5 new
cursor_cli test classes included)
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
* docs(cursor_cli): add e2e prerequisites, 11-test matrix, and smoke-test guide
Fixed with Kilo MiniMax M3.
Addresses the @haofeif review comment requesting that the new provider be
validated with the examples/assign e2e workflow. The e2e test classes
themselves were added in commit 8384c24; this commit expands
docs/cursor-cli.md to give maintainers a complete how-to-run guide.
The End-to-End Testing section now documents:
- **Prerequisites**: Cursor CLI binary, CAO server, the three
examples/assign/ profiles installed for the cursor_cli provider (with the
--provider cursor_cli flag), and tmux.
- **Pytest invocation**: the exact uv run pytest -m e2e ... -k cursor_cli -o
"addopts=" form, with per-file invocations for handoff / assign /
send_message / allowed_tools / supervisor_orchestration. Notes that the
default pytest addopts excludes the e2e marker and the override is required.
- **The 11 core e2e tests** (per
skills/cao-provider/references/lessons-learnt.md lesson #20, which lists the
11 minimum-success tests per provider). Each row links the test class +
method to what it validates, and explicitly notes that
test_restricted_supervisor_cannot_bash is marked xfail (soft enforcement via
SECURITY_PROMPT, a documented limitation).
- **Manual examples/assign/ smoke test** as a quick interactive validation
outside pytest, with the 'supervisor must NOT do the work itself' invariant
called out and a pointer to lessons #19 and #16 for common failure modes.
- **Troubleshooting entry #6** for the 'E2E tests skip with Cursor CLI not
installed' auto-skip path (require_cursor fixture behaviour).
The doc structure follows docs/claude-code.md as the reference template;
section ordering, heading levels, and code-block formatting all match.
Files: docs/cursor-cli.md
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
* fix(cursor_cli): strip full terminal escapes + restore q_cli/opencode_cli in README
Fixed with Kilo MiniMax M3.
Addresses the two Copilot review comments (submitted against commit 0f6f678
via the copilot-pull-request-reviewer[bot] account).
**Copilot review #1 (src/cli_agent_orchestrator/providers/cursor_cli.py line 403, extract_last_message_from_script escape handling):**
Copilot flagged that the function operated on tmux 'capture-pane -e' output
(escape sequences enabled) but only stripped SGR colour codes. Cursor CLI
re-renders cursor-positioning and OSC sequences inside the response area, and
these can leak into the extracted text or break the separator/prompt
detection in get_status().
Fixes:
- The separator regex in BOTH get_status() and
extract_last_message_from_script() now tolerates any CSI sequence (not just
SGR) interleaved between the box-drawing characters, using the strict
ECMA-48 grammar (parameter bytes 0x30-0x3F, final byte 0x40-0x7E). This
prevents a stray 'ESC [' introducer from being consumed.
- The response region in extract_last_message_from_script() is now stripped
with a full-escape regex that handles CSI ('\x1b[...' final byte), OSC
('\x1b]...' BEL or ST), and 2-byte ESC sequences
('\x1b<intermediates><final>'). The docstring explicitly explains why we do
NOT use the shared strip_terminal_escapes() helper (it normalises \r → \n,
which would split single-line spinner frames into multiple lines and is
destructive for response extraction).
**Copilot review #2 (README.md line 167, Quick Start 'Valid:' list):**
The Quick Start snippet's inline '# Valid:' provider list omitted 'q_cli'
even though 'q_cli' is a supported provider everywhere else in the README and
in ProviderType. The cross-provider section's bullet list was also missing
'opencode_cli'.
Fixes:
- README.md line 167: restored 'q_cli' to the inline list.
- README.md line 256: added 'opencode_cli' to the cross-provider list (was
already in the inline quickstart but missing here).
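The full-escape stripping described under review #1 could be sketched like this. The regex is an illustrative reconstruction from the three sequence families named above (CSI, OSC, short ESC sequences), not the regex in cursor_cli.py, and `strip_escapes` is a hypothetical name.

```python
import re

# Illustrative pattern; alternation order matters, CSI and OSC come first.
_ESCAPE_RE = re.compile(
    r"\x1b\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]"  # CSI: params, intermediates, final
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"         # OSC terminated by BEL or ST
    r"|\x1b[\x20-\x2F]*[\x30-\x7E]"               # other short ESC sequences
)


def strip_escapes(text: str) -> str:
    """Remove terminal escape sequences without touching \\r or \\n."""
    return _ESCAPE_RE.sub("", text)
```

Unlike a helper that normalises carriage returns, this sketch leaves `\r` in place, which matches the rationale given for not reusing strip_terminal_escapes().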
**New unit tests:**
- test_separator_matching_tolerates_interleaved_csi_escapes: exercises a
separator line with SGR colour escapes between every box-drawing character
(\x1b[38;5;245m────...\x1b[0m). Asserts both get_status() and
extract_last_message_from_script() still find the boundary.
- test_extraction_strips_cursor_positioning_sequences: injects \x1b[2K (erase
line) and \x1b[H (cursor home) into the response region and verifies they
are stripped from the result.
- test_extraction_strips_osc_title_sequences: injects an OSC window-title
update (\x1b]0;Cursor Agent\x07) and verifies it is stripped from the result.
**Quality gate:**
- black / isort: clean (238 files)
- mypy src/cli_agent_orchestrator/providers/cursor_cli.py: Success: no issues found
- pytest test/ --ignore=test/e2e -m 'not integration': 2376 passed, 0 failed
(was 2373 before this commit)
- pytest test/providers/test_cursor_cli_unit.py: 57 passed (was 54; +3
escape-handling tests)
- All 7 commits in the branch force-pushed to feat/cursor-cli-provider on the fork
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
* fix(cursor_cli): address 6 Copilot review comments on PR #296
Fixed with Kilo MiniMax M3.
Addresses all six inline review comments from the
copilot-pull-request-reviewer[bot] review submitted at 2026-06-15T07:45:57Z
against commit c5d1523 (PR #296).
**1. IDLE_PROMPT_PATTERN must be start-of-line anchored (review #3411781807):**
The previous pattern [❯>][\s\xa0] would match the leading '❯ ' on echoed user
input lines (e.g. '❯ Summarize…') or any '> ' inside response content. Since
get_status() returns COMPLETED whenever *any* match exists, this could
misclassify status and could also make extract_last_message_from_script()
anchor on the wrong 'idle' prompt.
Fix: pattern is now ^\s*(?:\x1b\[[0-9;]*m)*[❯>](?:\x1b\[[0-9;]*m)*[\s\xa0]
(start-of-line, optional leading whitespace, optional SGR colour codes, the
'❯' or '>' prompt char, optional SGR codes after it, then a single
whitespace). MULTILINE flag is set on the finditer call so '^' matches at
every line start, not just the buffer start. Matches the claude_code
provider's _SOL_IDLE_RE pattern.
**2. IDLE_PROMPT_PATTERN_LOG must be start-of-line anchored (review #3411781846):**
Same fix applied to the log-file variant so the pre-check is consistent with
live status detection. The log variant omits the SGR code allowances (logs
are plain text) but retains the ^\s* start-of-line anchor.
**3. initialize() must arm the StatusMonitor stickiness gate (review #3411781865):**
initialize() now calls status_monitor.notify_input_sent() before send_keys so
the next PROCESSING transition isn't suppressed when a ready status was
previously latched. The import is lazy to break a circular import:
status_monitor imports provider_manager which imports cursor_cli.
**4. _build_cursor_command() must fall back to cursor-agent (review #3411781886):**
The build path now uses shutil.which() to prefer the primary 'agent' binary
and fall back to the legacy 'cursor-agent' alias when only that one is
installed. Raises ProviderError with an install-from-URL message when neither
is on $PATH. The e2e require_cursor fixture in test/e2e/conftest.py accepts
either name, so the launch now behaves consistently.
**5+6. Separator regex must tolerate CSI *between* dashes (reviews #3411781900 + #3411781914):**
The previous regex (?:\x1b\[[\x30-\x3F]*[\x40-\x7E])*\u2500{20,} only allowed
CSI sequences *before* the entire dash run, not between the dashes. The new
pattern is:
    ^(?:\x1b\[[\x30-\x3F]*[\x40-\x7E])?(?:\u2500(?:\x1b\[[\x30-\x3F]*[\x40-\x7E])?){20,}$
This is a repeated unit (─ + optional CSI) 20+ times. The optional CSI at the
front handles Cursor's initial SGR colour setup.
The bytes between the introducer and the final byte are restricted to the
ECMA-48 parameter range (0x30-0x3F) so a stray 'ESC [' introducer is not
consumed. The pattern is anchored to a full line so a stray dash sequence
inside response content is not matched. The MULTILINE flag is required on
finditer so '^' and '$' match at every line start/end.
**New unit tests (11 added, 68 total in test_cursor_cli_unit.py):**
- TestRegexPatterns::test_idle_prompt_is_start_of_line_anchored
- TestRegexPatterns::test_idle_prompt_rejects_arrow_in_response_content
- TestRegexPatterns::test_idle_prompt_log_is_start_of_line_anchored
- TestSeparatorPattern::test_matches_plain_separator
- TestSeparatorPattern::test_matches_csi_before_dash_run
- TestSeparatorPattern::test_matches_csi_between_dashes
- TestSeparatorPattern::test_does_not_match_dash_sequence_inside_content
- TestBuildCommandBinaryResolution::test_prefers_agent_when_both_available
- TestBuildCommandBinaryResolution::test_falls_back_to_cursor_agent_when_agent_missing
- TestBuildCommandBinaryResolution::test_raises_when_neither_binary_installed
- TestInitialize::test_initialize_arms_stickiness_gate
Also added a module-level autouse _stub_cursor_binary fixture that patches
shutil.which('agent') to return /usr/bin/agent for every test, so existing
tests don't have to opt in to the binary-resolution mock. The 3 new
TestBuildCommandBinaryResolution tests override this fixture to test the
legacy-alias fallback and the missing-both error path.
**Quality gate:**
- black / isort: clean (238 files)
- mypy src/cli_agent_orchestrator/providers/cursor_cli.py: Success
- pytest test/ --ignore=test/e2e -m 'not integration': 2387 passed, 0 failed
(was 2376 before this commit; +11 new tests)
- pytest test/providers/test_cursor_cli_unit.py: 68 passed (was 57)
- All 6 Copilot review comments addressed in this commit.
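The two anchored patterns quoted in this commit message can be exercised with a short sketch. The regex bodies are copied from the message; the constant names and the compile-time MULTILINE flag are assumptions about how the provider wires them up.

```python
import re

# Separator: a full line of 20+ box-drawing dashes, each optionally followed
# by a CSI sequence, with one optional CSI at the front.
SEPARATOR_PATTERN = re.compile(
    r"^(?:\x1b\[[\x30-\x3F]*[\x40-\x7E])?"
    r"(?:\u2500(?:\x1b\[[\x30-\x3F]*[\x40-\x7E])?){20,}$",
    re.MULTILINE,
)

# Idle prompt: start of line, optional whitespace and SGR codes, then the
# prompt character followed by whitespace or a non-breaking space.
IDLE_PROMPT_PATTERN = re.compile(
    r"^\s*(?:\x1b\[[0-9;]*m)*[❯>](?:\x1b\[[0-9;]*m)*[\s\xa0]",
    re.MULTILINE,
)


def count_separators(buffer: str) -> int:
    """Count full-line separators in a terminal capture."""
    return len(SEPARATOR_PATTERN.findall(buffer))
```

Anchoring both patterns to line boundaries is what keeps a run of dashes or a '> ' in the middle of response text from being mistaken for TUI structure.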
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>

* fix(cursor_cli): drop --trust, expose cursor_cli in /agents/providers and UI

Three small follow-ups uncovered by a real end-to-end test of the Cursor CLI provider on Cursor CLI v2026.06.15 (the version installed via `curl https://cursor.com/install | bash`):

1. `cursor_cli` was missing from the `provider_binaries` dict in `/agents/providers` (`api/main.py`), so the web UI's provider dropdown never advertised it as installed, even with the binary on PATH. Added `"cursor_cli": "agent"` to match the resolution logic in `_build_cursor_command`.

2. `FALLBACK_PROVIDERS` in `web/src/components/AgentPanel.tsx` (the list shown when the API doesn't return providers, e.g. before the server is queried) was also missing `cursor_cli`. Added it.

3. `--trust` is rejected by Cursor CLI v2026.06.15 in interactive REPL mode (`Error: --trust can only be used with --print/headless mode`) and caused the launch to fail with a 500. Dropped it: the CAO launch flow already confirms workspace trust, and the interactive REPL doesn't have a per-directory trust dialog that `--trust` would skip. `--force` is still passed so per-tool approvals don't block.

All 68 unit tests in `test/providers/test_cursor_cli_unit.py` and the web UI tests still pass.

* fix(cursor_cli): add v2026+ TUI-placeholder status detection (issue #299)

Cursor CLI v2026.x runs as a full Ink/TUI in interactive mode. The `❯` prompt and the `─────` separator that older text-mode builds emitted into the pipe-pane buffer are now TUI widgets and never reach the FIFO; the regex suite in the original provider (matching those markers) returns UNKNOWN forever, and `wait_until_status(... IDLE, COMPLETED)` times out after 30 seconds, which is the same 500 the issue reports.

The only stable plain-text signal the v2026 TUI emits is the input-box placeholder "Plan, search, build anything".
Cursor REPLACES it with the user's text on submit and only redraws it once the response is fully delivered, so:

  * present in the tail of the rolling buffer → IDLE / COMPLETED
  * absent (replaced by the user's text) → PROCESSING

This commit:

  * adds `TUI_PLACEHOLDER_PATTERN` and `TUI_STATUS_BAR_PATTERN` constants and consults the last 1KB of the cleaned buffer for the placeholder before falling through to the existing separator-based regex suite (which still classifies older text-mode Cursor builds correctly);
  * records a real v2026.06.15 idle fixture (cursor_cli_v2026_idle_output.txt) captured via tmux pipe-pane + cat, and a synthetic v2026 processing fixture (placeholder replaced by user text);
  * adds a `TestGetStatusV2026Tui` test class with 8 tests that cover the placeholder present/absent cases, the TUI TAIL WINDOW contract (long-response eviction does not flip back to IDLE), and the status-bar guard so a half-initialised TUI does not false-positive;
  * drops `--trust` from the launch command: v2026 rejects `--trust` in interactive REPL mode ("only works with --print/headless mode"), the CAO launch flow already confirms workspace trust, and the interactive REPL has no per-directory trust dialog for the flag to skip anyway. `--force` is still passed so per-tool approvals do not block.

All 75 unit tests in test/providers/test_cursor_cli_unit.py pass (7 new ones). The v2026 fixtures and the placeholder detection are isolated from the original regex suite, so older text-mode builds keep being classified the same way as before.

End-to-end validation of the full launch against v2026 is BLOCKED by a separate, deeper issue uncovered while running this patch: v2026 has no `--agent` flag, so the provider's command exits immediately with "error: unknown option '--agent'" before the TUI is ever rendered. Tracked separately. The TUI detection in this commit is correct in isolation and will be needed once the launch command is fixed.
* fix(cursor_cli): rebuild launch command for Cursor CLI v2026 (issue #300)

Cursor CLI v2026.06.15 dropped two flags the original provider relied on:

  * `--agent <name>` (rejected with "error: unknown option '--agent'")
  * `--mcp <json>` (rejected with "error: unknown option '--mcp'")

It also changed the semantics of an existing flag:

  * `--system-prompt` now takes a *file path* rather than inline text ("Error: failed to read --system-prompt file: <text>" when given inline text)

The v2026 equivalents / replacements are:

  * `--agent <name>` -> none. The CAO agent profile body is carried in the `--system-prompt` file instead, so multi-agent orchestration (handoff / assign) still works.
  * `--mcp <json>` -> `--plugin-dir <path>` pointing at a directory holding a Cursor plugin manifest. We synthesise the directory at build time, materialising the profile's mcpServers map into the manifest's `mcpServers` field and forwarding `CAO_TERMINAL_ID` so MCP tools can resolve the current terminal. `--approve-mcps` is still passed to skip per-server approval dialogs.
  * `--system-prompt` -> writes the prompt to a per-session file under `~/.aws/cli-agent-orchestrator/tmp/<tid>-system-prompt.md` and passes the path.

All 75 unit tests pass. End-to-end validated on this Codespaces:

    $ curl -X POST .../sessions?provider=cursor_cli&agent_profile=developer
    HTTP 201 in 7.5s
    Status changes: unknown -> completed (TUI placeholder detection)

    $ curl -X POST .../sessions?provider=cursor_cli&agent_profile=data_analyst
    HTTP 201 in 7.6s
    Status: idle

Both sessions render the v2026 TUI correctly and the StatusMonitor latches the placeholder-driven IDLE/COMPLETED state, so the TUI-detection patch from 9502dd1 (#299) and the launch-command rework in this commit are now both end-to-end functional.

Closes #300.

* fix(api): honor `*` wildcard in WS_ALLOWED_CLIENTS

The WebSocket terminal viewer was rejecting browser connections from any IP that wasn't in the literal allowlist.
Operators running cao-server inside a container (Codespaces / devcontainers / remote hosts) could pass `CAO_WS_ALLOWED_CLIENTS="*"` to mean "any client", but the check was an exact `in` comparison so the literal string `"*"` never matched a real client IP and the handler always closed the connection with code 4003.

Treat a literal `*` entry in `WS_ALLOWED_CLIENTS` as a wildcard that disables the IP check, matching the same opt-in semantics operators expect for `CAO_ALLOWED_HOSTS`. Container / Codespaces setups that pass `CAO_WS_ALLOWED_CLIENTS="*"` will now accept WS connections from the browser without enumerating the tunnel IP ahead of time.

Security note: the WebSocket endpoint exposes unauthenticated PTY access and is intended for localhost-only use; setting `CAO_WS_ALLOWED_CLIENTS="*"` together with `--host 0.0.0.0` on a host reachable from the open internet is a real risk and should be paired with a reverse proxy that enforces auth (the existing comment at the top of the WS handler still applies).

* fix(api): enable uvicorn proxy_headers for WS over HTTPS tunnels

Codespaces / devcontainers / reverse-proxy setups (anything that terminates TLS in front of cao-server and forwards plain HTTP) need uvicorn to honour X-Forwarded-Proto / X-Forwarded-For. Without `proxy_headers=True`, uvicorn sees the raw HTTP request and the browser's WSS upgrade through the HTTPS tunnel is rejected: the WebSocket terminal viewer closes immediately with no useful diagnostic on the client side.

`forwarded_allow_ips="*"` trusts the X-Forwarded-* headers from any upstream. Combined with `CAO_ALLOWED_HOSTS="*"` and `CAO_WS_ALLOWED_CLIENTS="*"` (now wildcard-aware after the previous fix) this is the standard Codespaces setup.

Security: the WS endpoint still exposes unauthenticated PTY access; operators fronting cao-server with a reverse proxy that enforces auth should narrow `forwarded_allow_ips` to the proxy's IP range instead of leaving it on the wildcard.
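The wildcard fix for `WS_ALLOWED_CLIENTS` described above can be sketched as follows. This is a minimal illustration, not the handler in api/main.py: the helper names `parse_allowlist` and `ws_client_allowed` are hypothetical, and treating the environment variable as a comma-separated list is an assumption made for the example.

```python
from typing import List


def parse_allowlist(raw: str) -> List[str]:
    """Split an allowlist string into trimmed, non-empty entries.

    Assumption: entries are comma-separated; the real parsing may differ.
    """
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def ws_client_allowed(client_ip: str, allowed: List[str]) -> bool:
    """Return True when the client may open the WebSocket.

    A literal "*" entry disables the IP check entirely. Without this
    branch, "*" is only compared as an exact string and never matches
    a real client address, which is the bug the commit describes.
    """
    if "*" in allowed:
        return True
    return client_ip in allowed


if __name__ == "__main__":
    strict = parse_allowlist("127.0.0.1, ::1")
    wildcard = parse_allowlist("*")
    print(ws_client_allowed("10.0.0.7", strict))    # rejected (close code 4003 in CAO)
    print(ws_client_allowed("10.0.0.7", wildcard))  # accepted via the wildcard opt-in
```

Keeping the wildcard as an explicit opt-in entry preserves the conservative default: an empty or localhost-only list still rejects unknown clients.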
* docs(codespaces): add Codespaces setup and troubleshooting guide

Add docs/codespaces.md covering server start command, the four CAO_* env vars, port 9889 forwarding, local verification, and a 404 troubleshooting table. Link it from CONTRIBUTING.md and DEVELOPMENT.md.

* fix(cursor_cli): drop --system-prompt entirely for v2026.06.15

Cursor CLI v2026.06.15's backend (https://agentn.global.api5.cursor.sh) rejects every request that carries a `--system-prompt <file>` payload with `[invalid_argument] unknown option '--system-prompt'`. The bug is reproducible regardless of file contents: a 3-character file triggers the same error as a 4.5KB system prompt. Cursor's own debug log at /tmp/cursor-agent-logs/session-*.log shows the ConnectError firing on the very first request and all retries failing the same way. Cursor's log also shows the TUI retrying the request 3 times before giving up, which is what was rendering as `Reconnecting (attempt N, Ns)` in the web UI.

This commit removes the --system-prompt flag from the launch command entirely. Multi-turn inbox still works because:

  * the CAO role / system prompt reaches the agent through the cao-mcp-server MCP tool's handoff / assign payloads (on the wire, not via Cursor's launch arguments);
  * the agent still has the @cao-mcp-server tool set loaded via --plugin-dir, so assign / handoff / send_message all work;
  * soft tool-restriction enforcement (SECURITY_PROMPT) is no longer available on the launch line (documented in the docstring); needs a different enforcement path if a per-profile restricted mode is required.

The _write_system_prompt_file helper is preserved (gated by `if False and ...`) so a future Cursor point release that fixes the backend can re-enable the flag with a single-line change.
End-to-end validated in this Codespaces:

    $ curl -X POST .../sessions?provider=cursor_cli
    HTTP 201 in 8.5s
    status: unknown -> completed -> processing

    $ curl -X POST .../terminals/<id>/input?message=hi%20there
    HTTP 200 in 0.4s
    Cursor log: [nal_agent_retries] Request successful
    Cursor log: outcome=success
    TUI status bar: "Composer 2.5 Fast 6.4%" (response streaming)

74/74 unit tests pass (one fewer than before; the test_skill_prompt_appended test was redundant with the new test_agent_profile_loaded_but_not_passed_as_flag case).

* fix(cursor_cli): use 'ctrl+c to stop' as v2026+ processing signal (issue #299 follow-up)

The previous TUI detection landed in 9502dd1 used the input-box placeholder ("Plan, search, build anything" / "Add a follow-up") as the idle / processing signal. Live testing on v2026.06.15 showed that the placeholder is ALWAYS present regardless of agent state: it is the *input box's empty state*, not a "ready for next turn" indicator. The previous detector therefore classified every post-launch TUI frame as PROCESSING (placeholder absent in the 1KB tail after the user submits), and never transitioned back to COMPLETED once the agent finished a turn.

The correct v2026+ signal is the "ctrl+c to stop" hint Cursor renders on the same line as the placeholder every frame the agent is actively working on a turn. The hint disappears once the response is fully delivered and the input box is back to the placeholder alone. The hint is rendered in the last few hundred bytes of every Cursor TUI frame, so the same 1KB TUI TAIL WINDOW the previous patch used is still the right scope.

Updated get_status() so the primary v2026+ PROCESSING check is "ctrl+c to stop" present in the tail (replacing the previous "placeholder absent in the tail" heuristic). The IDLE / COMPLETED check no longer requires the placeholder in the tail: it is the *absence* of the processing indicator, paired with the status bar being visible, that signals a turn has finished.
TUI_PLACEHOLDER_PATTERN now matches BOTH placeholder strings Cursor v2026 uses ("Plan, search, build anything" on a fresh launch, "Add a follow-up" after the first turn) so the fixture-based unit tests cover both conversation states. A new TUI_PROCESSING_INDICATOR_PATTERN ("ctrl+c to stop") is the new primary signal. Both are imported by the test module and covered by test_processing_indicator_pattern_documented.

Live validation in this Codespaces (agent CLI v2026.06.15):

    $ curl -X POST .../sessions?provider=cursor_cli
    HTTP 201 in 7.5s
    status: completed (TUI idle, no indicator)

    $ curl -X POST .../terminals/<id>/input?message=hi%20again
    T+1s: processing (ctrl+c to stop indicator visible)
    T+3s: processing
    T+5s: processing
    T+8s: completed (indicator gone, response delivered)
    T+10s+: completed (stable, no false positives)

77/77 unit tests pass (added 2 tests: test_processing_indicator_in_tail_returns_processing and test_processing_indicator_pattern_documented, plus a new post-turn-idle fixture cursor_cli_v2026_post_turn_idle_output.txt that captures the live 'Add a follow-up' placeholder text and the absence of the 'ctrl+c to stop' indicator).

* fix(cursor_cli): address haofeif review + clean up v2026 follow-ups

Implements the five action items from the human review on #296 plus the OUTDATED Copilot review threads from the v2026 follow-up commits.

Action items from haofeif's review:

1. Make forwarded_allow_ips configurable. Replaces the hard-coded forwarded_allow_ips="*" (which trusts X-Forwarded-* from any upstream) with a new TRUSTED_FORWARDER_IPS constant that defaults to ["127.0.0.1", "::1"] and is extended by CAO_FORWARDED_ALLOW_IPS (comma-separated). A literal "*" is still honoured as a disable-the-check opt-in (matches the existing CAO_WS_ALLOWED_CLIENTS="*" semantics), so Codespaces users with no other option get the same behaviour as before.
The default now matches the conservative CAO_WS_ALLOWED_CLIENTS default and is safe for bare cao-server --host 127.0.0.1 deployments.

2. Run black. Re-formats cursor_cli.py, test_cursor_cli_unit.py, api/main.py, and constants.py to match the project's [tool.black] config (line-length 100, target-version py310). CI's black check should now pass.

3. Implement temp file cleanup. cleanup() now removes every per-session temp file the provider created: <CAO_TMP_DIR>/<tid>-system-prompt.md and <CAO_TMP_DIR>/<tid>-cursor-plugins/ (including the plugin.json manifest inside the plugin dir). The paths are tracked in self._tmp_paths as the helpers create them, and cleanup() walks the registry, calls shutil.rmtree on directories / Path.unlink on files, swallows transient OSError (logged at WARNING), and drains the registry so a second cleanup is a safe no-op.

4. Remove dead code. Drops the if False and profile is not None block that preserved the v2026-disabled --system-prompt injection path. The _write_system_prompt_file helper is still available for a future Cursor point release; the launch command does not call it. Also removed a duplicate get_status body that had been left over from a merge.

5. Address binary resolution. The provider now prefers the unambiguous cursor-agent alias first (only the Cursor CLI ships it) and falls back to the documented primary agent name. When agent is selected the provider runs an agent --version probe and validates the banner looks like "agent <4-digit-year>.<...>" - gpg-agent and other unrelated agent-named tools on the host PATH no longer get launched with Cursor-only flags. Failed probes / unknown banners raise ProviderError with a clear "uninstall or symlink to cursor-agent" message.

Docs + module docstring:

- The module docstring was describing flags the provider no longer uses (--system-prompt, --agent, --mcp, --trust).
  Rewritten to match the v2026 launch command, list the deliberately-omitted flags, and explain the rationale (issue #299 / #300).
- docs/cursor-cli.md updated for status detection, permission bypass, agent profile integration, launch command, tool restrictions, and troubleshooting.

Test coverage added (82/82 unit tests pass):

- test_agent_validation_passes_for_cursor_binary
- test_agent_validation_rejects_non_cursor_binary
- test_agent_validation_handles_probe_timeout
- test_cursor_agent_skips_validation
- test_prefers_cursor_agent_when_both_available
- test_cleanup_removes_tracked_tmp_paths

Fixture fix:

- cursor_cli_v2026_processing_output.txt had lost the actual \x1b (ESC) bytes before CSI sequences. Rebuilt with real escape bytes so the fixture exercises the same escape-stripping code path the live v2026 TUI produces.

* fix(cursor_cli): split IDLE / COMPLETED on the turn counter

The provider used to report COMPLETED for both a fresh spawn (no user input yet) and a finished turn (response delivered, ready for the next prompt). The supervisor inbox and the StatusMonitor's stickiness gate treat both as "ready" so functionally nothing broke, but the UI badge showed the wrong label right after Spawn Agent - the user-visible status was "completed" for a terminal that had not yet received a single message.

The split:

    IDLE      = fresh spawn, never received a turn.
    COMPLETED = at least one turn has been delivered, the agent is back to a non-processing state.

Cursor CLI v2026's TUI looks the same in both states (placeholder visible, status bar visible, no "ctrl+c to stop" hint), so the buffer alone cannot distinguish them. The provider now tracks a turn counter that ``mark_input_received`` (the hook the terminal service calls after every ``send_input``) increments. ``get_status`` returns IDLE while the counter is zero and COMPLETED once at least one turn has been delivered.

Why not invent a buffer signal?
The placeholder text swaps from "Plan, search, build anything" (fresh launch) to "Add a follow-up" (after the first turn), so in principle the placeholder text is a discriminator. But by the time the placeholder has been swapped the first turn has already been delivered - the swap happens on the agent's first input, not on a fresh launch. Using the counter is robust and does not depend on a brittle TUI signal that could change in a future v2026 point release.

Implementation:

- ``__init__`` initialises ``self._turns: int = 0``.
- New ``mark_input_received()`` override increments the counter; called by the terminal service on every input delivery.
- The two IDLE / COMPLETED branches in ``get_status`` now return ``COMPLETED if self._turns > 0 else IDLE``.

Test updates:

- Every test that previously asserted COMPLETED now calls ``provider.mark_input_received()`` first to simulate the post-turn state.
- New ``test_idle_fixture_without_input_returns_idle`` and ``test_v2026_idle_fixture_fresh_spawn_returns_idle`` assert the IDLE label for the fresh-spawn state on both legacy and v2026 fixtures.

Live validation in this Codespaces:

    $ curl -X POST .../sessions?provider=cursor_cli
    HTTP 201, 7.5s
    status: idle (T+0s)

    $ curl -X POST .../terminals/<id>/input?message=hi
    status: processing (T+1s-5s)
    status: completed (T+8s+)

    $ curl -X POST .../terminals/<id>/input?message=how are you
    status: processing (T+1s-8s)
    status: completed (T+12s+)

84/84 unit tests pass.
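The turn-counter split described above can be sketched roughly as follows. This is a simplified stand-in, assuming a hypothetical `TurnCountingProvider` class and a local `TerminalStatus` enum; only `mark_input_received`, `get_status`, `_turns`, the "ctrl+c to stop" hint, and the 1KB tail window come from the commit message. The status-bar guard and the legacy regex suite the real provider consults are omitted for brevity.

```python
from enum import Enum


class TerminalStatus(Enum):
    # Stand-in for the project's status enum (assumed names and values).
    IDLE = "idle"
    PROCESSING = "processing"
    COMPLETED = "completed"


PROCESSING_HINT = "ctrl+c to stop"  # v2026+ processing indicator from the commit
TAIL_WINDOW = 1024                  # 1KB tail window, as described above


class TurnCountingProvider:
    """Hypothetical, trimmed-down provider showing only the counter logic."""

    def __init__(self) -> None:
        self._turns: int = 0

    def mark_input_received(self) -> None:
        # Called once per delivered input; the buffer cannot tell a fresh
        # spawn from a finished turn, so the count is tracked out of band.
        self._turns += 1

    def get_status(self, buffer: str) -> TerminalStatus:
        tail = buffer[-TAIL_WINDOW:]
        if PROCESSING_HINT in tail:
            return TerminalStatus.PROCESSING
        # Same-looking TUI frame in both cases: disambiguate on the counter.
        return TerminalStatus.COMPLETED if self._turns > 0 else TerminalStatus.IDLE


if __name__ == "__main__":
    provider = TurnCountingProvider()
    idle_frame = "Plan, search, build anything"
    print(provider.get_status(idle_frame).value)
    provider.mark_input_received()
    print(provider.get_status("Add a follow-up  ctrl+c to stop").value)
    print(provider.get_status("Add a follow-up").value)
```

Because the counter lives on the provider instance, the same idle-looking frame maps to IDLE before the first input and to COMPLETED afterwards, which matches the live validation sequence quoted in the commit.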
* fix(tests): address CI failures for #296

- isort: fix import order in test_cursor_cli_unit.py
- test_list_providers_all_installed: bump to 10 providers and assert cursor_cli
- test_main_custom_host_port / test_main_extends_cors_for_custom_host_port: account for new proxy_headers/forwarded_allow_ips kwargs added to uvicorn.run in 4a82417 (uvicorn proxy_headers for WS over HTTPS tunnels, #149)

Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>

---------

Co-authored-by: ThePlenkov <6381507+ThePlenkov@users.noreply.github.com>
Co-authored-by: kiloconnect[bot] <240665456+kiloconnect[bot]@users.noreply.github.com>
Co-authored-by: Kilo <kilo@local>
Summary
Adds complete test coverage for the Q CLI provider with 51 tests achieving 100% code coverage.
What's Included
Running Tests