| System | Category | Review Depth | Why It Matters | Status |
|---|---|---|---|---|
| Roo Code | Core reference | Deep | Strong runtime and UI state exposure | Reviewed v1 |
| Cline | Core reference | Deep | Clear execution loop and completion semantics | Reviewed v1 |
| Claude Code | Core reference | Deep | Strong policy, hooks, and composition surface | Reviewed v1 |
| GitHub Copilot | Comparator | Medium | Product-shell convergence and coding-agent UX pressure | Seeded |
| Cursor | Comparator | Medium | Strong market reference even with partial implementation visibility | Seeded |
| Kilo Code | Comparator | Medium | Additional open-source coding-agent comparison point | Seeded |
| OpenCode | Comparator | Medium | Useful for alternative runtime and UX choices | Seeded |
| Continue | Comparator | Medium | Extension-based agent tooling and workflow patterns | Planned |
| Aider | Comparator | Medium | File-edit and command-loop baseline | Planned |
| OpenHands | Comparator | Medium | Agent task execution with broader action surfaces | Seeded |
| Vercel AI SDK | Runtime neighbor | Medium | Defines low-level tool and streaming primitives | Planned |
| LangGraph | Runtime neighbor | Medium | Clarifies boundary with graph-based orchestration | Planned |
| Mastra | Runtime neighbor | Light | Comparator for orchestration-layer choices | Planned |

Tier 1 systems (the core references) are reviewed first because they expose the clearest evidence for runtime semantics. Tier 2 systems (the comparators) help separate durable patterns from product packaging. Tier 3 systems (the runtime neighbors) help define the architectural boundary between the proposed runtime layer and neighboring infrastructure.

| Axis | Roo Code | Cline | Claude Code | Copilot | Cursor | OpenHands |
|---|---|---|---|---|---|---|
| Explicit completion action | Strong, explicit `attempt_completion` tool | Strong, explicit `attempt_completion` tool | Medium, more hook-mediated than tool-mediated in public docs | Observable but product-mediated | Observable but implicit | Mixed |
| Completion guardrails | Strong, blocked by failed tools and open todos | Strong, double-check and feedback reinjection | Strong, `Stop` and `TaskCompleted` hooks can block closure | Medium signal | Medium signal | Mixed |
| Recoverable tool failures | Strong, errors stay in task flow | Strong, `toolError(...)` stays in-band | Strong, `PostToolUseFailure` is a first-class lifecycle event | Medium | Medium | Strong |
| Separate runtime state | Strong, explicit agent loop state and required action | Medium, visible through message and handler structure | Medium, state visible through lifecycle and permission model | Strong in product UX | Strong in product UX | Medium |
| Context compaction | Strong, threshold-based context management | Strong, proactive compaction in subagent loop | Strong, pre/post compact lifecycle and auto-compaction docs | Medium | Medium | Medium |
| Subagents | Medium, delegated tasks and parent reopen flow | Strong, bounded runners with aggregated status | Strong, bounded tools, max turns, scope, isolation, resume | Medium | Medium | Mixed |
| Hooks or policy interception | Medium, approval and task events | Strong, pre/post/task-complete hooks | Strong, dedicated hook system | Medium | Medium | Medium |
| Permission model | Strong | Strong | Strong | Strong | Strong | Medium |
| Command output streaming | Strong | Strong | Strong | Strong | Strong | Medium |

The Tier 1 columns above (Roo Code, Cline, Claude Code) are now source-backed at a first-pass level. The comparator columns remain provisional until those systems receive the same treatment.

- The matrix tracks behavior, not feature marketing.
- "Reviewed v1" means a first source-backed review pass exists, but the document still needs deeper file-by-file evidence collection.
- "Seeded" means a first review stub exists and the system is part of active analysis.
- "Planned" means it is in scope but not yet meaningfully analyzed.
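
To make two of the matrix axes concrete, the sketch below illustrates "Recoverable tool failures" (errors returned in-band instead of aborting the task) and "Completion guardrails" (a completion attempt blocked by failed tools and open todos). It is a minimal illustration only: `ToolResult`, `TaskState`, `runTool`, and `checkCompletion` are hypothetical names invented here and are not taken from Roo Code, Cline, Claude Code, or any other reviewed system.

```typescript
// Hypothetical types for illustration; not the API of any reviewed system.
type ToolResult =
  | { ok: true; output: string }
  | { ok: false; error: string };

interface TaskState {
  toolResults: ToolResult[]; // results of tool calls made so far in the task
  openTodos: string[];       // todo items not yet marked done
}

type CompletionDecision =
  | { allowed: true }
  | { allowed: false; feedback: string };

// Run a tool and convert a thrown exception into an in-band error result,
// so the loop can hand the failure back to the model instead of aborting.
function runTool(tool: () => string): ToolResult {
  try {
    return { ok: true, output: tool() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Gate a completion attempt: any failed tool call or open todo blocks
// closure and produces feedback that can be reinjected into the task.
function checkCompletion(state: TaskState): CompletionDecision {
  const failures = state.toolResults.filter((r) => !r.ok);
  const reasons: string[] = [];
  if (failures.length > 0) {
    reasons.push(`${failures.length} tool call(s) failed`);
  }
  if (state.openTodos.length > 0) {
    reasons.push(`${state.openTodos.length} todo(s) still open`);
  }
  return reasons.length === 0
    ? { allowed: true }
    : { allowed: false, feedback: `Completion blocked: ${reasons.join("; ")}` };
}
```

In this sketch a system would rate "Strong" on both axes when every tool call goes through something like `runTool` and every completion attempt goes through something like `checkCompletion`. Real systems differ in where the gate lives (a tool handler, a hook, or product UI), which is what the matrix tries to capture.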