Problem
Permission prompts create friction in autonomous workflows. cc-toolgate handles rule-based command evaluation well, but ambiguous cases still gate on human approval. As agents operate more autonomously (overnight builds, background chains, multi-project work), the permission bottleneck becomes the primary constraint on throughput.
Two-track approach
Track 1: cc-toolgate improvements
Continue expanding the rule-based evaluator — more subcommand patterns, better config overlay, smarter heuristics. This is the deterministic layer.
Related: butterflyskies/cc-toolgate open issues (#6, butterflyskies/tasks#9, butterflyskies/tasks#14)
Track 2: LLM-as-judge fallback
When cc-toolgate can't make a deterministic decision, fall back to a small LLM that evaluates the tool call in context. The judge sees:
- The command being evaluated
- The working directory and project context
- Recent conversation history (what was the agent trying to do?)
- The cc-toolgate config (what rules exist?)
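One way to picture the judge's input is as a single serializable request. The sketch below is illustrative only: `JudgeRequest`, `build_request`, the field names, and the `max_turns` cutoff are assumptions made for this example, not part of cc-toolgate.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class JudgeRequest:
    """Hypothetical payload for one judge evaluation; field names are illustrative."""
    command: str                                         # the command being evaluated
    cwd: str                                             # working directory
    project_context: str                                 # short description of the project
    recent_history: list = field(default_factory=list)   # recent conversation turns
    config_rules: dict = field(default_factory=dict)     # relevant cc-toolgate rules

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable for logging and diffing.
        return json.dumps(asdict(self), sort_keys=True)


def build_request(command, cwd, project_context, history, config_rules, max_turns=5):
    """Keep only the last `max_turns` turns so the judge prompt stays small."""
    return JudgeRequest(
        command=command,
        cwd=cwd,
        project_context=project_context,
        recent_history=list(history)[-max_turns:],
        config_rules=dict(config_rules),
    )
```

Trimming the history is a design choice for a small model: the judge needs enough context to infer intent, but not the whole session.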
Architecture options:
- PreToolUse hook → LLM call: cc-toolgate shells out to a local model or MCP-connected judge agent
- Interconnected agents over MCP bridge: a dedicated judge agent running as an MCP server that cc-toolgate queries
- Message queue: cc-toolgate publishes evaluation requests, judge agent consumes and responds
The judge doesn't replace cc-toolgate; it only handles the cases where cc-toolgate returns "ask". The deterministic rules are always checked first.
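The ordering described above (rules first, judge only on "ask") can be sketched as follows. This is a minimal illustration under stated assumptions: the "allow" and "deny" values, the prefix-matching `evaluate_rules` stand-in, and the `decide` function are hypothetical and do not reflect cc-toolgate's actual evaluator. The judge is passed in as a plain callable, so any of the transports listed under the architecture options could sit behind it.

```python
from typing import Callable, Optional

# "ask" comes from the text above; "allow" and "deny" are assumed for illustration.
ALLOW, ASK, DENY = "allow", "ask", "deny"


def evaluate_rules(command: str, rules: dict) -> str:
    """Stand-in for the deterministic layer: longest matching prefix wins, else "ask"."""
    matches = [p for p in rules if command == p or command.startswith(p + " ")]
    if not matches:
        return ASK
    return rules[max(matches, key=len)]


def decide(command: str, rules: dict, judge: Optional[Callable[[str], str]] = None):
    """Return (decision, source). The judge runs only when the rules return "ask"."""
    verdict = evaluate_rules(command, rules)
    if verdict != ASK or judge is None:
        return verdict, "rules"
    try:
        judged = judge(command)
    except Exception:
        # Fail closed: a broken judge falls back to the human prompt.
        return ASK, "judge-error"
    if judged not in (ALLOW, ASK, DENY):
        # An unparseable verdict is treated the same as a failure.
        return ASK, "judge-error"
    return judged, "judge"
```

Because a deterministic "allow" or "deny" returns before the judge is consulted, the judge can never override an explicit rule; it can only resolve, or decline to resolve, an "ask".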
Success criteria
- Overnight autonomous work sessions don't stall on permission prompts
- Dangerous commands still gate (the judge is conservative by default)
- Audit trail: every judge decision is logged with reasoning
- False positive rate (blocking safe commands) drops without increasing false negatives (allowing dangerous ones)
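The audit-trail and error-rate criteria could be checked with something like the sketch below. The JSON Lines record layout, the `log_decision` and `error_rates` helpers, and the idea of hand-labelling a reviewed log are all assumptions for illustration, not an existing cc-toolgate feature.

```python
import io
import json
from datetime import datetime, timezone


def log_decision(stream, command, decision, reasoning, source="judge"):
    """Append one JSON line per decision so the trail can be searched or replayed."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "command": command,
        "decision": decision,
        "source": source,
        "reasoning": reasoning,
    }
    stream.write(json.dumps(record) + "\n")
    return record


def error_rates(labelled):
    """Compute (false_positive_rate, false_negative_rate) from a reviewed log.

    labelled: iterable of (decision, is_dangerous) pairs.
    False positive: a safe command that was not allowed.
    False negative: a dangerous command that was allowed.
    """
    labelled = list(labelled)
    safe = [d for d, dangerous in labelled if not dangerous]
    risky = [d for d, dangerous in labelled if dangerous]
    fp = sum(d != "allow" for d in safe) / len(safe) if safe else 0.0
    fn = sum(d == "allow" for d in risky) / len(risky) if risky else 0.0
    return fp, fn
```

Comparing the two rates before and after enabling the judge gives a direct reading on the last criterion: the first number should fall while the second stays flat.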
Related
- cc-toolgate butterflyskies/tasks#14 (config-driven command evaluation — step toward declarative rules)
- Lina's shower thought: LLM judge as PreToolUse hook
- The --dangerously-skip-permissions flag on itzpapalotl is the current workaround
🤖 Generated with Claude Code