Summary
Custom skills never auto-trigger when invoked via `claude -p` (non-interactive/headless mode), regardless of how explicit or directive the skill description is. Precision is 100% (no false positives), but recall is persistently 0% across multiple description iterations.
Environment
- OS: Windows 11 Home 10.0.26200
- Claude plan: Claude Pro (OAuth; no `ANTHROPIC_API_KEY`)
- Claude Code version: latest (VS Code extension + CLI)
- skill-creator: marketplace version (Skills 2.0)
- Skill location: `C:\Users\<user>\.claude\skills\<skill-name>\SKILL.md`
Steps to Reproduce
- Create a custom skill in `~/.claude/skills/<skill>/SKILL.md` with a description following the Skills 2.0 format
- Create a trigger eval set (`trigger-eval.json`) with 20 queries (10 should-trigger, 10 should-not)
- Run `run_eval.py` from skill-creator:

```shell
python -m scripts.run_eval \
  --eval-set trigger-eval.json \
  --skill-path ~/.claude/skills/<skill> \
  --runs-per-query 3 --verbose
```

- Observe the results
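For context, the eval set's exact schema is defined by skill-creator; the sketch below only illustrates the shape used here, and the field names (`query`, `should_trigger`) are assumptions rather than confirmed parts of the tool's format:

```python
import json

# Hypothetical shape of trigger-eval.json: a list of query/expectation pairs.
# The real schema comes from skill-creator's run_eval.py and may differ.
eval_set = [
    {"query": "can you search youtube for claude code tutorials?", "should_trigger": True},
    {"query": "find me 5 youtube videos about python async programming", "should_trigger": True},
    {"query": "how do i download a youtube video to watch offline?", "should_trigger": False},
    {"query": "search google for recent news about llm agents", "should_trigger": False},
]

with open("trigger-eval.json", "w", encoding="utf-8") as f:
    json.dump(eval_set, f, indent=2)
```

The full set used in this report had 20 such entries, split evenly between should-trigger and should-not.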
Observed Behavior
All should-trigger queries score `trigger_rate: 0.0`; the skill never auto-triggers in any of the `claude -p` subprocess calls spawned by `run_eval.py`.
Example output:
```
[FAIL] rate=0/3 expected=True: can you search youtube for claude code tutorials?
[FAIL] rate=0/3 expected=True: find me 5 youtube videos about python async programming
[FAIL] rate=0/3 expected=True: /yt-search IfcOpenShell tutorial
[PASS] rate=0/3 expected=False: how do i download a youtube video to watch offline?
[PASS] rate=0/3 expected=False: search google for recent news about llm agents

Results: 10/20 passed — precision=100%, recall=0%, accuracy=50%
```
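The reported numbers follow directly from the confusion counts: the skill never fired, so every should-trigger query is a false negative and there are no false positives. With zero true positives, precision is strictly undefined (0/0); `run_eval.py` evidently reports it as 100% when there are no false positives. A quick sketch of the arithmetic:

```python
# Confusion counts from the run above: 10 false negatives (skill never
# fired on should-trigger queries), 10 true negatives, nothing else.
tp, fp, fn, tn = 0, 0, 10, 10

# tp + fp == 0 makes precision 0/0; treat "no false positives" as 100%,
# matching the tool's reported number.
precision = tp / (tp + fp) if (tp + fp) else 1.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.0%} recall={recall:.0%} accuracy={accuracy:.0%}")
```

This is why "precision=100%" here is not a positive signal: it is vacuous, and recall is the metric that exposes the bug.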
Description Tested (v2 — very explicit)
```yaml
description: >
  Fetches live YouTube search results via yt-dlp — real titles, channels,
  view counts, upload dates, and URLs that Claude cannot know from training data.
  ALWAYS use this skill whenever the user wants to search YouTube or find YouTube
  videos on any topic. Never answer YouTube search requests from memory; only this
  skill can provide current, accurate video results.
  Trigger on any request containing: "search youtube", "find youtube videos",
  "look up youtube tutorials", "youtube results", "yt-search", "/yt-search",
  "search yt", "pull up youtube videos", "what youtube videos", "find me videos about".
  Do NOT trigger for: summarizing a specific video URL, downloading videos, or
  general web searches with no YouTube intent.
```
Even with "ALWAYS" directives, explicit keyword lists, and negative examples, recall remains 0%.
Expected Behavior
Skills with highly directive descriptions matching the query content should auto-trigger with meaningful recall (>50%) in headless mode.
Additional Notes
- The skill works perfectly when invoked explicitly via the `/yt-search <query>` slash command
- `run_loop.py` (the automated optimization loop) is not usable for Claude Pro users: it calls the `anthropic.Anthropic()` SDK directly and requires `ANTHROPIC_API_KEY`, which is not available with OAuth authentication
- The `run_eval.py` script required a Windows-specific fix: `select.select()` does not work with subprocess pipes on Windows (WinError 10038); it was replaced with a `queue.Queue` + background-thread pattern
- Tested across 2 full description iterations with no improvement in recall
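For reference, the Windows fix mentioned above follows a standard pattern: since `select.select()` only works on sockets on Windows, a daemon thread drains the subprocess pipe into a `queue.Queue`, and the main thread polls the queue with a timeout. A minimal sketch (not the exact patch applied to `run_eval.py`):

```python
import queue
import subprocess
import threading


def _drain(pipe, q):
    # Background thread: a blocking readline() is fine here, because the
    # main thread never blocks on the pipe itself.
    for line in iter(pipe.readline, ""):
        q.put(line)
    pipe.close()


def read_with_timeout(cmd, timeout=30.0):
    """Run cmd, collecting stdout lines; stop if no output arrives within timeout."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    q = queue.Queue()
    threading.Thread(target=_drain, args=(proc.stdout, q), daemon=True).start()
    lines = []
    while True:
        try:
            lines.append(q.get(timeout=timeout))
        except queue.Empty:
            break  # no output within the window: stop waiting
        if proc.poll() is not None and q.empty():
            break  # process exited and its output is fully drained
    return lines
```

Unlike `select.select()`, this works identically on Windows and POSIX, at the cost of one extra thread per subprocess.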
Workaround
None for auto-triggering. Users must invoke skills explicitly via slash commands.
Suggested Investigation
- Does `claude -p` pass available skills to the model differently than interactive mode?
- Is there a flag or environment variable to enable skill auto-triggering in headless mode?
- Should `run_eval.py` work differently for Pro/OAuth users (no API key)?