Epic: Address permissions fatigue — cc-toolgate evolution + LLM-as-judge fallback #52

@butterflysky-ai

Problem

Permission prompts create friction in autonomous workflows. cc-toolgate handles rule-based command evaluation well, but ambiguous cases still gate on human approval. As agents operate more autonomously (overnight builds, background chains, multi-project work), the permission bottleneck becomes the primary constraint on throughput.

Two-track approach

Track 1: cc-toolgate improvements

Continue expanding the rule-based evaluator — more subcommand patterns, better config overlay, smarter heuristics. This is the deterministic layer.

Related: butterflyskies/cc-toolgate open issues (#6, butterflyskies/tasks#9, butterflyskies/tasks#14)

Track 2: LLM-as-judge fallback

When cc-toolgate can't make a deterministic decision, fall back to a small LLM evaluating the tool call in context. The judge sees:

  • The command being evaluated
  • The working directory and project context
  • Recent conversation history (what was the agent trying to do?)
  • The cc-toolgate config (what rules exist?)
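As a minimal sketch of what that judge input could look like, the four items above can be bundled into one request object and rendered as a prompt. The class name `JudgeRequest`, its field names, and the `max_history` cap are all hypothetical; nothing here reflects an existing cc-toolgate type.

```python
import json
from dataclasses import dataclass, field


@dataclass
class JudgeRequest:
    """Context handed to the judge. All field names are illustrative assumptions."""
    command: str                 # the command being evaluated
    cwd: str                     # working directory
    project_context: str         # short description of the project
    recent_history: list = field(default_factory=list)   # recent conversation turns
    toolgate_config: dict = field(default_factory=dict)  # the rules that exist

    def to_prompt(self, max_history: int = 5) -> str:
        """Render the request as plain text, keeping only the latest turns."""
        history = self.recent_history[-max_history:] if max_history > 0 else []
        lines = [
            f"Command: {self.command}",
            f"Working directory: {self.cwd}",
            f"Project: {self.project_context}",
            "Recent history:",
        ]
        lines += [f"  - {turn}" for turn in history]
        # sort_keys keeps the rendered config stable across runs
        lines.append("Rules: " + json.dumps(self.toolgate_config, sort_keys=True))
        return "\n".join(lines)
```

Capping the history keeps the prompt small enough for a small model; the right cap is an open question.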

Architecture options:

  • PreToolUse hook → LLM call: cc-toolgate shells out to a local model or MCP-connected judge agent
  • Interconnected agents over MCP bridge: a dedicated judge agent running as an MCP server that cc-toolgate queries
  • Message queue: cc-toolgate publishes evaluation requests, judge agent consumes and responds
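Whichever transport is chosen, the exchange has the same shape: send a request, wait a bounded time for a verdict, and fall back to the human prompt if none arrives. The sketch below illustrates the message-queue variant, using an in-process `queue.Queue` as a stand-in for a real broker. The function names, the message layout, and the 2-second default timeout are assumptions for illustration only.

```python
import queue
import threading


def request_verdict(requests: queue.Queue, command: str, timeout_s: float = 2.0) -> str:
    """Publish an evaluation request and wait for the judge's reply.

    Returns "ask" if no reply arrives in time, so a stalled or missing
    judge defers to the human prompt instead of allowing the command.
    """
    reply: queue.Queue = queue.Queue(maxsize=1)
    requests.put({"command": command, "reply": reply})
    try:
        return reply.get(timeout=timeout_s)
    except queue.Empty:
        return "ask"


def judge_worker(requests: queue.Queue, judge) -> None:
    """Consume requests until a None sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:
            break
        try:
            verdict = judge(item["command"])
        except Exception:
            verdict = "ask"  # a failing judge never approves anything
        item["reply"].put(verdict)
```

A worker would be started with `threading.Thread(target=judge_worker, args=(requests, judge), daemon=True).start()`, where `judge` is any callable that maps a command string to a verdict.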

The judge doesn't replace cc-toolgate; it handles only the cases where cc-toolgate returns "ask". The deterministic rules are always checked first.
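The ordering described above can be sketched as a small wrapper: the rule-based verdict is final unless it is "ask", and only then is the judge consulted. The `evaluate` function and the "allow" and "deny" verdict strings are assumptions; the issue itself names only "ask".

```python
from typing import Callable

# Hypothetical verdict vocabulary; only "ask" is named in the issue.
VERDICTS = {"allow", "deny", "ask"}


def evaluate(
    command: str,
    rules: Callable[[str], str],
    judge: Callable[[str], str],
) -> str:
    """Run the deterministic rules first; call the judge only on "ask"."""
    verdict = rules(command)
    if verdict != "ask":
        return verdict  # deterministic decisions are never overridden
    try:
        judged = judge(command)
    except Exception:
        return "ask"  # judge failure falls back to the human prompt
    # Anything outside the known vocabulary is treated as "ask".
    return judged if judged in VERDICTS else "ask"
```

Treating judge errors and unrecognised output as "ask" keeps the fallback conservative: the worst case is the prompt the user would have seen anyway.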

Success criteria

  • Overnight autonomous work sessions don't stall on permission prompts
  • Dangerous commands still gate (the judge is conservative by default)
  • Audit trail: every judge decision is logged with reasoning
  • False positive rate (blocking safe commands) drops without increasing false negatives (allowing dangerous ones)
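To make the audit-trail and error-rate criteria measurable, one possible approach is a JSON-lines audit record plus a helper that computes both rates from labelled outcomes. The record fields and function names below are hypothetical, and the labels saying which commands were actually dangerous would have to come from manual review.

```python
import json
import time


def audit_record(command, verdict, reasoning, source="judge", now=None) -> str:
    """Serialise one decision as a JSON line (field names are assumptions)."""
    return json.dumps(
        {
            "ts": time.time() if now is None else now,
            "command": command,
            "verdict": verdict,
            "reasoning": reasoning,
            "source": source,  # e.g. "rules" or "judge"
        },
        sort_keys=True,
    )


def error_rates(outcomes):
    """Compute (false_positive_rate, false_negative_rate).

    `outcomes` is an iterable of (blocked, dangerous) booleans.
    False positive: a safe command was blocked.
    False negative: a dangerous command was allowed.
    """
    outcomes = list(outcomes)
    safe = [blocked for blocked, dangerous in outcomes if not dangerous]
    risky = [blocked for blocked, dangerous in outcomes if dangerous]
    fp_rate = sum(safe) / len(safe) if safe else 0.0
    fn_rate = sum(not b for b in risky) / len(risky) if risky else 0.0
    return fp_rate, fn_rate
```

Comparing the two rates before and after enabling the judge, over the same labelled set of commands, would show whether the last criterion holds.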

Related

  • cc-toolgate butterflyskies/tasks#14 (config-driven command evaluation — step toward declarative rules)
  • Lina's shower thought: LLM judge as PreToolUse hook
  • The --dangerously-skip-permissions flag on itzpapalotl is the current workaround

🤖 Generated with Claude Code
