| 1 | +--- |
| 2 | +name: analyze-github-action-logs |
| 3 | +description: Analyze recent GitHub Actions workflow runs to identify patterns, mistakes, and improvements. Use when asked to "analyze workflow logs", "review action runs", or "analyze GitHub Actions". |
| 4 | +compatibility: Requires gh CLI and access to the GitHub repository. |
| 5 | +--- |
| 6 | + |
| 7 | +# Analyze GitHub Action Logs |
| 8 | + |
| 9 | +Fetch and analyze recent GitHub Actions runs for a given workflow. Review agent/step performance, identify wasted effort and mistakes, and produce a report with actionable improvements. |
| 10 | + |
| 11 | +## Input |
| 12 | + |
| 13 | +You need: |
| 14 | + |
| 15 | +- **`workflow`** (required) — The workflow file name or ID (e.g., `issue-triage.yml`, `deploy.yml`). |
| 16 | +- **`repo`** (optional) — The GitHub repository in `OWNER/REPO` format. Defaults to `withastro/astro`. |
| 17 | +- **`count`** (optional) — Number of recent completed runs to analyze. Defaults to `5`. |
| 18 | + |
| 19 | +## Step 1: List Recent Runs |
| 20 | + |
| 21 | +Fetch the most recent completed runs for the workflow. Filter by `--status=completed`: |
| 22 | + |
| 23 | +```bash |
| 24 | +gh run list --workflow=<workflow> -R <repo> --status=completed -L <count> |
| 25 | +``` |
| 26 | + |
| 27 | +Present the list to orient yourself: run IDs, titles, conclusion (success/failure), and duration. Pick the runs to analyze: prefer a mix of successes and failures if available, and prefer longer runs, which tend to go through more stages, over shorter runs that may have exited early. |
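If the default table output is hard to scan, the same listing can be requested as structured data. The following is a minimal sketch, assuming the `gh` CLI is installed and authenticated; `list_runs` is a hypothetical helper name, and the defaults mirror the ones in the Input section. The `createdAt` and `updatedAt` timestamps bracket the run, so their difference only approximates duration.

```bash
# Sketch: list completed runs as tab-separated rows
# (run ID, conclusion, createdAt, updatedAt, title).
# list_runs is a hypothetical helper, not part of gh.
list_runs() {
  local workflow=$1 repo=${2:-withastro/astro} count=${3:-5}
  gh run list --workflow="$workflow" -R "$repo" --status=completed -L "$count" \
    --json databaseId,conclusion,createdAt,updatedAt,displayTitle \
    --jq '.[] | [.databaseId, .conclusion, .createdAt, .updatedAt, .displayTitle] | @tsv'
}
```

Calling `list_runs issue-triage.yml` would use the default repository and count.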
| 28 | + |
| 29 | +## Step 2: Fetch Logs |
| 30 | + |
| 31 | +For each run you want to analyze, save the full log to a temp file: |
| 32 | + |
| 33 | +```bash |
| 34 | +gh run view <run_id> -R <repo> --log > /tmp/actions-run-<run_id>.log |
| 35 | +``` |
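When several runs are selected, the fetch can be looped. This is a rough sketch only: `fetch_logs` and its `out_dir` parameter are hypothetical, and the `gh run view` invocation is the one shown above.

```bash
# Sketch: save one log file per run ID into out_dir.
# fetch_logs is a hypothetical helper; a failed fetch is reported and skipped.
fetch_logs() {
  local repo=$1 out_dir=$2 run_id
  shift 2
  for run_id in "$@"; do
    if gh run view "$run_id" -R "$repo" --log > "$out_dir/actions-run-$run_id.log"; then
      echo "saved $out_dir/actions-run-$run_id.log"
    else
      echo "failed to fetch run $run_id" >&2
    fi
  done
}
```

Pass the repository, the output directory (for example `/tmp`), and then the run IDs chosen in Step 1.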
| 36 | + |
| 37 | +## Step 3: Identify Step/Skill Boundaries |
| 38 | + |
| 39 | +Search each log file for markers that indicate where each step or skill starts and ends. The markers depend on the workflow — look for patterns like: |
| 40 | + |
| 41 | +- **Flue skill markers**: `[flue] skill("..."): starting` / `completed` |
| 42 | +- **GitHub Actions step markers**: Step name headers in the log output |
| 43 | +- **Custom markers**: Any `START`/`END` or similar delimiters the workflow uses |
| 44 | + |
| 45 | +```bash |
| 46 | +grep -anE 'skill\(|step|START|END|starting|completed' /tmp/actions-run-<run_id>.log | head -50 |
| 47 | +``` |
| 48 | + |
| 49 | +From this, determine which line ranges correspond to each step/skill. Also find any result markers: |
| 50 | + |
| 51 | +```bash |
| 52 | +grep -n "RESULT_START\|RESULT_END\|extractResult" /tmp/actions-run-<run_id>.log |
| 53 | +``` |
| 54 | + |
| 55 | +Note: Some log files contain null bytes, in which case `grep` reports a binary-file match instead of printing the matching lines. Pass `-a` to treat the file as text. |
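Turning marker matches into a line range can be sketched as follows. It assumes the Flue marker format shown in the list above (`skill("NAME"): starting` / `completed`); both helper names are hypothetical, and only the first start and first completion of a skill are considered.

```bash
# Sketch: print "START END" line numbers for one skill's section of a log.
# Assumes markers like: [flue] skill("NAME"): starting / completed.
skill_range() {
  local log=$1 skill=$2 start end
  # -a treats the log as text even with null bytes; -F matches the string literally.
  start=$(grep -anF "skill(\"$skill\"): starting" "$log" | head -n 1 | cut -d: -f1)
  end=$(grep -anF "skill(\"$skill\"): completed" "$log" | head -n 1 | cut -d: -f1)
  [ -n "$start" ] && [ -n "$end" ] || return 1
  echo "$start $end"
}

# Sketch: print only the lines inside that range.
skill_section() {
  local range
  range=$(skill_range "$1" "$2") || return 1
  sed -n "${range% *},${range#* }p" "$1"
}
```

For workflows that use other delimiters, swap the two marker strings; the range logic stays the same.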
| 56 | + |
| 57 | +## Step 4: Analyze Each Step (Use Subagents) |
| 58 | + |
| 59 | +For each step/skill that ran, **launch a subagent** to analyze that section's log. This is critical to avoid polluting your context with thousands of log lines. |
| 60 | + |
| 61 | +For each subagent, provide: |
| 62 | + |
| 63 | +1. The log file path and the line range for that step |
| 64 | +2. If skill instruction files exist for the workflow, tell the subagent to read them first for context |
| 65 | +3. The run title/context so the subagent understands what was being done |
| 66 | +4. The analysis criteria below |
| 67 | + |
| 68 | +### Analysis Criteria |
| 69 | + |
| 70 | +Tell each subagent to evaluate: |
| 71 | + |
| 72 | +1. **Correctness** — Was the step's final result/verdict correct? |
| 73 | +2. **Efficiency** — How long did it take? What's a reasonable baseline? Where was time wasted? |
| 74 | +3. **Mistakes** — Wrong tool calls, failed commands retried without changes, unnecessary rebuilds, etc. |
| 75 | +4. **Instruction compliance** — If skill instructions exist, did the agent follow them? Where did it deviate? |
| 76 | +5. **Scope creep** — Did the agent do work that belongs in a different step? |
| 77 | +6. **Suggestions** — Specific, actionable changes that would prevent the issues found. |
| 78 | + |
| 79 | +Tell each subagent to return a structured response with: Summary, Time Analysis, Issues Found (with estimated time wasted for each), and Suggestions for Improvement. |
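One way to keep the briefs consistent across subagents is to generate them from a template. This is a sketch under assumptions: `subagent_brief` and its argument order are hypothetical, and the wording simply condenses the inputs, criteria, and response sections listed above.

```bash
# Sketch: print a brief for one subagent.
# Arguments: log path, start line, end line, run title.
subagent_brief() {
  local log=$1 start=$2 end=$3 title=$4
  cat <<EOF
Analyze lines $start-$end of $log (run: $title).
If skill instruction files exist for this workflow, read them first.
Evaluate: correctness, efficiency, mistakes, instruction compliance, scope creep, suggestions.
Return: Summary, Time Analysis, Issues Found (with estimated time wasted for each), Suggestions for Improvement.
EOF
}
```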
| 80 | + |
| 81 | +## Step 5: Consolidate Report |
| 82 | + |
| 83 | +After all subagents return, synthesize their findings into a single report. Structure it as: |
| 84 | + |
| 85 | +### Per-Run Summary Table |
| 86 | + |
| 87 | +For each run analyzed, include a table: |
| 88 | + |
| 89 | +| Step/Skill | Time | Result | Time Wasted | Top Issue | |
| 90 | +| ---------- | ---- | ------ | ----------- | --------- | |
| 91 | + |
| 92 | +### Cross-Cutting Patterns |
| 93 | + |
| 94 | +Identify issues that appeared across multiple runs or multiple steps. These are the highest-value improvements. Common patterns to look for: |
| 95 | + |
| 96 | +- **TodoWrite abuse** — Agent wasting time on task list management during automated runs |
| 97 | +- **Server management failures** — Port conflicts, failed process kills, stale log files |
| 98 | +- **Tool misuse** — Using `curl` instead of `gh`, `jq` not found, etc. |
| 99 | +- **Scope creep** — One step doing work that belongs in another |
| 100 | +- **Unnecessary rebuilds** — Building packages multiple times without changes |
| 101 | +- **Test timeouts** — Running slow E2E/Playwright tests that time out |
| 102 | +- **Instruction violations** — Agent doing something the instructions explicitly forbid |
| 103 | +- **Redundant work** — Re-reading files, re-running searches, re-installing dependencies |
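A quick count of suspect strings across the saved logs can hint at which patterns recur before reading the subagent reports in detail. In this sketch the search strings are illustrative guesses, not markers confirmed by any particular workflow, and `count_patterns` is a hypothetical helper.

```bash
# Sketch: for each log, count the lines matching a few suspect strings.
# Output columns: log path, search string, matching line count.
count_patterns() {
  local log pattern
  for log in "$@"; do
    for pattern in 'TodoWrite' 'EADDRINUSE' 'command not found' 'Timeout'; do
      printf '%s\t%s\t%s\n' "$log" "$pattern" "$(grep -acF "$pattern" "$log")"
    done
  done
}
```

A string that shows up in most logs is a candidate cross-cutting pattern; a count alone does not say whether time was actually wasted.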
| 104 | + |
| 105 | +### Prioritized Recommendations |
| 106 | + |
| 107 | +Rank your improvement suggestions by estimated time savings across all runs. For each recommendation: |
| 108 | + |
| 109 | +1. **What to change** — Which file(s) to edit and what to add/modify |
| 110 | +2. **Why** — What pattern it addresses, with evidence from the runs |
| 111 | +3. **Estimated impact** — How much time it would save per run |
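The ranking itself is simple arithmetic once the per-issue estimates are collected. The sketch below assumes a hypothetical input format of one finding per line, `issue<TAB>minutes`, gathered from the subagent reports; `rank_issues` is likewise a hypothetical name.

```bash
# Sketch: total the estimated minutes wasted per issue, largest total first.
# Input: tab-separated lines of "issue<TAB>minutes" (an assumed format).
rank_issues() {
  awk -F '\t' '{ total[$1] += $2 } END { for (i in total) printf "%s\t%s\n", total[i], i }' "$@" |
    sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

The totals are only as good as the estimates behind them, so treat the order as a starting point for the recommendations rather than a measurement.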
| 112 | + |
| 113 | +## Output |
| 114 | + |
| 115 | +Present the full consolidated report. Do NOT edit any workflow or skill files — only report findings and recommendations. The user will decide which changes to apply. |