fix: enforce analyze-first flow for 'create test cases for X' triggers

angelovstanton · angelovstanton · commit cf94718edea9 · 2026-04-10T15:23:43.000+03:00
The agent was bypassing the analyze step entirely when users said 'create
test cases for Standard Calculator', running 'spectra ai generate --suite
standard --count 5' directly. Two root causes:

1. Both the SKILL and agent prompt declared '--count {n} (default: 5)'.
   The LLM latched onto that default and fabricated '--count 5' even
   when the user gave no number.
2. The 'ALWAYS analyze first' rule had no explicit list of trigger
   phrases. Users say 'create test cases for X' just as often as
   'generate tests for X', and the agent was reading 'create' as a
   direct command.

Fix:
- Add a 'MANDATORY analyze-first triggers' section to both files with
  an explicit phrase list ('create test cases for X', 'generate tests
  for X', 'add tests to X', 'test the X module', etc.)
- Remove '(default: 5)' from the --count flag description in both
  files. New rule: never invent a count; pass it ONLY if the user said
  an explicit number, or use analysis.recommended after the analyze
  step returns
- Add explicit 'do NOT pass --count on the analyze call' guidance

The agent file is now 127 lines (under the 140-line limit). 1453 tests
still pass.
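
For illustration only, the count rule and the two-call flow described in this message can be sketched in Python. This is not code from the repository: the helper names (`build_analyze_args`, `resolve_count`, `build_generate_args`) are hypothetical, and `recommended` stands in for the `analysis.recommended` value read from the analyze result. Only the flags shown (`--suite`, `--count`, `--focus`, `--analyze-only`) come from the prompt files. What the CLI itself does when `--count` is omitted on the generate call is not stated here, so the sketch simply leaves the flag out in that case.

```python
from typing import List, Optional


def build_analyze_args(suite: str, focus: Optional[str] = None) -> List[str]:
    """First call: analysis only. --count is never passed here."""
    args = ["spectra", "ai", "generate", "--suite", suite, "--analyze-only"]
    if focus:
        args += ["--focus", focus]
    return args


def resolve_count(user_count: Optional[int],
                  recommended: Optional[int]) -> Optional[int]:
    """An explicit user number wins; otherwise use the analyze recommendation.

    Returns None when neither is available, so no default is ever invented.
    """
    if user_count is not None:
        return user_count
    return recommended


def build_generate_args(suite: str,
                        user_count: Optional[int],
                        recommended: Optional[int],
                        focus: Optional[str] = None) -> List[str]:
    """Later call (after user approval): add --count only if one was resolved."""
    args = ["spectra", "ai", "generate", "--suite", suite]
    count = resolve_count(user_count, recommended)
    if count is not None:
        args += ["--count", str(count)]
    if focus:
        args += ["--focus", focus]
    return args
```

With these hypothetical helpers, 'create test cases for Standard Calculator' maps to `build_analyze_args("standard")` first, and only after approval to `build_generate_args("standard", None, recommended)`, where `recommended` is whatever the analyze step returned.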
diff --git a/src/Spectra.CLI/Skills/Content/Agents/spectra-generation.agent.md b/src/Spectra.CLI/Skills/Content/Agents/spectra-generation.agent.md
@@ -14,6 +14,18 @@ You help users manage test cases using the SPECTRA CLI. Your primary function is
 
 **ALWAYS follow the full analyze → approve → generate flow. Never skip analysis.**
 
+**MANDATORY analyze-first triggers** — if the user says any of these (or paraphrases), you MUST start with `--analyze-only`, present the recommendation, and STOP for approval before generating:
+- "create test cases for {area}"
+- "generate test cases for {area}"
+- "generate tests for {area}"
+- "add tests to {area}"
+- "test the {module}"
+- "I need tests for {area}"
+- "cover {feature} with tests"
+- "write tests for {area}"
+
+When you hit any of these, do NOT pass `--count`. There is no default count. Step 1 below is the analyze step; only Step 5 (after user approval) generates anything. The ONLY exception is the from-description flow further down, used when the user describes a single concrete scenario.
+
 **HELP**: If user asks "help", "what can I do", or "what commands": follow the **`spectra-help`** SKILL (NOT this agent's own file). Read `spectra-help` and reply with its content.
 
 **QUICKSTART**: If user asks "how do I get started", "walk me through", "tutorial", "quickstart", "I'm new", or any onboarding/walkthrough question: follow the **`spectra-quickstart`** SKILL (NOT this agent's own file). Read `spectra-quickstart` and reply with its workflow overview.
@@ -23,7 +35,7 @@ You help users manage test cases using the SPECTRA CLI. Your primary function is
 | Flag | Description |
 |------|-------------|
 | `--suite {name}` | Target suite (REQUIRED) |
-| `--count {n}` | Number of tests (default: 5) |
+| `--count {n}` | Number of tests. NEVER invent a value. Pass it ONLY if the user said an explicit number, or use `analysis.recommended` from the analyze result. |
 | `--focus {text}` | Focus: "negative", "edge cases", "acceptance criteria", "happy path acceptance criteria" |
 | `--skip-critic` | Skip grounding verification |
 | `--analyze-only` | Only analyze, don't generate |
diff --git a/src/Spectra.CLI/Skills/Content/Skills/spectra-generate.md b/src/Spectra.CLI/Skills/Content/Skills/spectra-generate.md
@@ -12,14 +12,36 @@ You generate test cases by running CLI commands. Follow the EXACT tool sequence
 
 **ALWAYS follow the full analyze → approve → generate flow. Never skip the analysis step.**
 
+## MANDATORY: Analyze first, every time
+
+If the user is asking you to **generate, create, add, write, or build test cases for an area, feature, module, suite, page, or topic** — you **MUST** start with the analysis step (`--analyze-only`), present the recommendation, and **STOP and wait for the user to approve** before generating anything.
+
+**These trigger phrases ALL require the analyze-first flow:**
+- "create test cases for {X}"
+- "generate test cases for {X}"
+- "generate tests for {X}"
+- "add tests to {X}"
+- "test the {X} module"
+- "I need tests for {X}"
+- "cover {X} with tests"
+- "write tests for {X}"
+
+**Forbidden behaviors when the user names an area:**
+- Do NOT call `spectra ai generate` without `--analyze-only` on the first call.
+- Do NOT invent a `--count` value. There is no default — if the user didn't say a number, you DO NOT pass `--count` at all on the analyze call.
+- Do NOT skip Steps 1–4 below.
+- Do NOT ask the user how many tests they want — the analyze step will recommend a number.
+
+The ONLY time you skip analysis is when the user describes a single concrete scenario (see "When the user wants to create a specific test case" further down).
+
 **CRITICAL: First open `.spectra-progress.html` in Simple Browser — it auto-refreshes so the user can watch progress live. Then runInTerminal. Between runInTerminal and awaitTerminal, do NOTHING — no readFile, no listDirectory, no checking terminal output, no status messages. The progress page already shows live status. You ONLY read `.spectra-result.json` AFTER awaitTerminal returns.**
 
 ## CLI flags reference
 
 | Flag | Type | Description |
 |------|------|-------------|
 | `--suite {name}` | string | Target suite name (REQUIRED) |
-| `--count {n}` | int | Number of tests to generate (default: 5) |
+| `--count {n}` | int | Number of tests to generate. NEVER invent a value — use ONLY a number the user explicitly stated, or the `recommended` field returned by the analyze step. |
 | `--focus {text}` | string | Focus area: "negative", "edge cases", "high priority security", etc. |
 | `--skip-critic` | bool | Skip grounding verification |
 | `--analyze-only` | bool | Only analyze, don't generate |
@@ -36,6 +58,11 @@ You generate test cases by running CLI commands. Follow the EXACT tool sequence
 
 **Determine focus**: Extract the user's full intent into a `--focus` value. Include ALL qualifiers (type + topic). If no focus, omit `--focus`.
 
+**Determine count for the LATER generate step**:
+- If the user said an explicit number ("generate 10 tests", "give me 3"), use that.
+- Otherwise leave `count` blank for now — Step 4 will give you `analysis.recommended` to use in Step 5.
+- NEVER fall back to "5". There is no default.
+
 **Step 1**: show preview .spectra-progress.html?nocache=1
 
 **Step 2** — runInTerminal (include `--focus` if user specified any filtering):