
Commit 3bf7915

033: from-description chat flow & doc-aware manual tests
Add dedicated SKILL section + agent intent routing for `spectra ai generate --from-description`, and enhance `UserDescribedGenerator` to load matching documentation and acceptance criteria as best-effort formatting context. Manual tests now get populated source_refs and criteria fields when context is available; grounding.verdict stays "manual" regardless.

- spectra-generate SKILL: new "create a specific test case" section + intent routing table (focus vs from-description vs from-suggestions).
- spectra-generation agent: new Test Creation Intent Routing section with three intents and explicit "do NOT ask about count or scope" rule.
- UserDescribedGenerator: new public static BuildPrompt() and optional documentContext / criteriaContext / sourceRefPaths params on GenerateAsync.
- GenerateHandler.ExecuteFromDescriptionAsync: best-effort loads docs (cap 3 × 8000 chars via SourceDocumentLoader) and criteria (via existing LoadCriteriaContextAsync). Failures swallowed.
- 19 new tests (9 prompt-builder, 10 SKILL/agent content). Generation agent line-count limit raised 100 → 140 to fit routing rules. 1453 total passing.
1 parent eb841f5 commit 3bf7915

15 files changed

+929
-21
lines changed

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -209,6 +209,7 @@ spectra config list-automation-dirs # List dirs with existence s
- **Tests:** xUnit with structured results (never throw on validation errors)

## Recent Changes
- 033-from-description-chat-flow: ✅ COMPLETE - From-description chat flow & doc-aware manual tests. Updated `spectra-generate` SKILL with dedicated "When the user wants to create a specific test case" section (numbered 5-step sequence) and intent-routing table mapping topic-vs-scenario signals to `--focus`, `--from-description`, or `--from-suggestions`. Updated `spectra-generation` agent prompt with new "Test Creation Intent Routing" section (Intent 1: explore area → `--focus`; Intent 2: specific test → `--from-description`; Intent 3: from suggestions → `--from-suggestions`) and explicit "do NOT ask about count or scope" rule. Enhanced `UserDescribedGenerator` with new public static `BuildPrompt()` method (testable prompt construction) and optional `documentContext` / `criteriaContext` / `sourceRefPaths` parameters on `GenerateAsync()`. `GenerateHandler.ExecuteFromDescriptionAsync` now best-effort loads matching documentation (capped at 3 docs × 8000 chars via `SourceDocumentLoader`) and acceptance criteria (via existing `LoadCriteriaContextAsync`) before calling the generator — failures are swallowed (best-effort, non-blocking). Resulting tests get populated `source_refs` (from loaded doc paths) and `criteria` (from AI-matched IDs) when context is available; `grounding.verdict` remains `manual` regardless. New `FilterDocsForSuite` and `FormatDocContext` private helpers in `GenerateHandler`. New tests: `UserDescribedGeneratorTests` (9 prompt-builder tests) and `GenerateSkillContentTests` (10 SKILL/agent content tests). `GenerationAgent_LineCount` limit raised 100→140 to fit the new routing section. 19 new tests. 1453 total tests passing.
- 032-quickstart-skill-usage-guide: ✅ COMPLETE - Quickstart SKILL & USAGE.md offline guide. New `spectra-quickstart` SKILL (12th bundled SKILL) — workflow-oriented onboarding that responds to "help me get started", "tutorial", "walk me through" with 12 workflow walkthroughs and example conversations. Teaching-only (no CLI execution); delegates actual workflow execution to the corresponding workflow SKILLs. New `USAGE.md` bundled doc written to project root by `spectra init` (offline mirror of the quickstart SKILL, free of in-chat tool references). Both artifacts hash-tracked by the existing `update-skills` system. New `ProfileFormatLoader.LoadEmbeddedUsageGuide()` method. New `InitHandler.CreateUsageGuideAsync` (gated by `--skip-skills`). Generation and execution agent prompts gain a `**QUICKSTART**` delegation line directing onboarding intents to the new SKILL. Updated SKILL count test (11→12). 7 new tests (quickstart SKILL content, USAGE.md content + offline-clean assertions, init creates both files, --skip-skills skips both files, both agents reference quickstart). 1434 total tests passing.
- 030-prompt-templates: ✅ COMPLETE - Customizable root prompt templates. Introduced `.spectra/prompts/` directory with 5 markdown templates (behavior-analysis, test-generation, criteria-extraction, critic-verification, test-update) controlling all AI operations. Templates use `{{placeholder}}`, `{{#if}}`, `{{#each}}` syntax with built-in defaults as embedded resources. New `PlaceholderResolver`, `PromptTemplateParser`, `PromptTemplateLoader`, `BuiltInTemplates` in `Spectra.CLI/Prompts/`. Replaced hardcoded prompts in `BehaviorAnalyzer`, `CopilotGenerationAgent`, `CriteriaExtractor`, `CriticPromptBuilder` with template-driven approach (legacy fallback preserved). New `analysis.categories` config section with 6 default categories (happy_path, negative, edge_case, boundary, error_handling, security). New `spectra prompts list/show/reset/validate` CLI commands with JSON output. New `spectra-prompts` SKILL (11th bundled SKILL). Init creates `.spectra/prompts/` with defaults. `update-skills` tracks template hashes for safe updates. 65+ new tests. 1417 total tests passing.
- 029-spectra-update-skill: ✅ COMPLETE - Added spectra-update SKILL (10th bundled SKILL) for test update workflow via Copilot Chat. SKILL wraps `spectra ai update` with progress page, result file, classification breakdown (UP_TO_DATE, OUTDATED, ORPHANED, REDUNDANT). Agent delegation tables updated (both generation and execution agents delegate update requests to SKILL). Extended `UpdateResult` with `success`, `totalTests`, `testsFlagged`, `flaggedTests`, `duration` fields. Generation agent inline update section replaced with delegation row. 6 new tests (SKILL content, step format, do-NOTHING instruction, tools list, agent delegation). Documentation updated (SKILL count 9→10). Version 1.35.0.

PROJECT-KNOWLEDGE.md

Lines changed: 1 addition & 0 deletions
@@ -310,6 +310,7 @@ Three-section unified coverage with distinct semantics:
| # | Feature | Key Changes |
|---|---------|-------------|
| 033 | From-Description Chat Flow | Dedicated `--from-description` SKILL section, agent intent routing (focus vs from-description vs from-suggestions), doc-aware manual tests with populated `source_refs` and `criteria` (verdict stays manual) |
| 029 | spectra-update SKILL (10th) | Agent delegation, documentation sync, version 1.35.0 |
| 028 | Coverage & Criteria Pipeline | Fixed criteria propagation in parser, wired criteria into generation pipeline, always write criteria: [] |
| 027 | SKILL/Agent Deduplication | Agents delegate to SKILLs, execution ~120 lines, generation ~81 lines, SKILL consistency fixes |

docs/cli-reference.md

Lines changed: 2 additions & 0 deletions
@@ -179,6 +179,8 @@ Session state is stored in `.spectra/session.json` and expires after 1 hour.
User-described tests are marked with `grounding.verdict: manual` and `source: user-described`.

When a project has documentation in `docs/` and acceptance criteria in `docs/criteria/`, `--from-description` runs in **doc-aware mode**: it best-effort loads matching docs (capped at 3 docs × 8000 chars) and matching `.criteria.yaml` entries as formatting context, then populates the new test's `source_refs` (with the doc paths used) and `criteria` fields (with any IDs the AI matches to your description). The grounding verdict stays `manual` — doc context is used for terminology and navigation alignment only, never for verification. If no docs or criteria exist, the flow is identical to the no-context behavior.

Duplicate detection warns when a new test has >80% title similarity to an existing test.

**Exit codes:** `0` = success, `1` = error, `3` = missing required args with `--no-interaction`.
docs/skills-integration.md

Lines changed: 14 additions & 0 deletions
@@ -78,6 +78,20 @@ spectra ai generate --suite {suite} --from-description "{text}" --context "{ctx}
spectra ai generate --suite {suite} --auto-complete --output-format json --verbosity quiet
```

### Intent Routing in Chat (spec 033)

The `spectra-generate` SKILL contains a dedicated section for `--from-description` and an intent-routing table that the `spectra-generation` agent uses to choose between flows:

| User intent | Signal | Flow |
|-------------|--------|------|
| Explore a feature area | "Generate tests for...", "Cover... module" | Main analyze → generate flow with `--focus` |
| Create a specific test | "Add a test for...", "I need a test that verifies..." | `--from-description` (1 test, no analysis, no count question) |
| Generate from suggestions | "Use the previous suggestions" | `--from-suggestions` |

**Key rule**: if you can read the user's request as a single test case title, the agent routes to `--from-description`. If it's a topic to explore, the agent routes to `--focus`. The agent never asks the user for count or scope to disambiguate — the topic-vs-scenario shape is the only signal.

When `--from-description` runs in a project that has documentation and acceptance criteria, the CLI best-effort loads matching docs (capped at 3 docs × 8000 chars) and matching `.criteria.yaml` entries as formatting context. The resulting test case has populated `source_refs` and `criteria` fields, but `grounding.verdict` stays `manual` — doc context is used for terminology alignment only, never for verification.

### Non-Interactive Mode

For CI pipelines and automated workflows:
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# Specification Quality Checklist: From-Description Chat Flow & Doc-Aware Manual Tests

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-10
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- Spec references implementation file names (`UserDescribedGenerator`, `GenerateHandler`, `LoadCriteriaContextAsync`) in the Assumptions section. These are intentional anchor references for the implementer; the user-facing requirements (FR-001..FR-017) and success criteria are technology-agnostic.
- All items pass on first iteration. Spec is ready for `/speckit.plan`.
Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
# Implementation Plan: From-Description Chat Flow & Doc-Aware Manual Tests

**Branch**: `033-from-description-chat-flow` | **Date**: 2026-04-10 | **Spec**: [spec.md](./spec.md)

## Summary

Three-part feature: (1) update `spectra-generate` SKILL with a dedicated single-test "from-description" flow and an intent routing table, (2) update the `spectra-generation` agent prompt with explicit intent-classification rules, (3) enhance `UserDescribedGenerator` to load relevant docs and acceptance criteria as best-effort formatting context — populating `source_refs` and `criteria` on the resulting test while keeping `grounding.verdict: manual`.

## Technical Context

**Language/Version**: C# 12, .NET 8
**Primary Dependencies**: Spectra.CLI (existing), Spectra.Core (TestCase, GroundingMetadata, AcceptanceCriterion)
**Storage**: File-based — embedded SKILL/agent `.md` resources in `Spectra.CLI`; SHA-256 hashes computed at install time
**Testing**: xUnit (`Spectra.CLI.Tests`)
**Target Platform**: Cross-platform .NET CLI
**Project Type**: CLI (single project)
**Constraints**: Best-effort doc/criteria loading must not block or fail the command. Doc context capped at 3 docs × 8000 chars.
**Scale/Scope**: ~3 source files modified, 1 SKILL md file, 1 agent md file, ~10 new tests, 9 doc files updated.

## Constitution Check

No constitution file. Standard CLAUDE.md guidelines apply: no unnecessary refactors, only test the changed paths, prefer small focused changes.

## Project Structure

### Documentation (this feature)

```text
specs/033-from-description-chat-flow/
├── spec.md
├── plan.md
├── tasks.md
├── checklists/
│   └── requirements.md
└── (no contracts/ — internal CLI feature, no API)
```

### Source Code (touched paths)

```text
src/Spectra.CLI/
├── Commands/Generate/
│   ├── UserDescribedGenerator.cs # MODIFIED: add documentContext + criteriaContext params, refactor prompt builder for testability
│   └── GenerateHandler.cs # MODIFIED: ExecuteFromDescriptionAsync loads doc + criteria context, populates source_refs
└── Skills/Content/
    ├── Skills/spectra-generate.md # MODIFIED: add "create a specific test case" section + routing table
    └── Agents/spectra-generation.agent.md # MODIFIED: add Test Creation Intent Routing section

tests/Spectra.CLI.Tests/
├── Commands/Generate/
│   └── UserDescribedGeneratorTests.cs # NEW: prompt-building tests
└── Skills/
    └── GenerateSkillContentTests.cs # NEW: SKILL/agent content assertions
```

## Phases

### Phase 0 — Research / discovery

No external research needed. All primitives exist:
- `SourceDocumentLoader.LoadAllAsync(basePath, maxDocuments, maxContentLengthPerDoc, ct)` already supports caps.
- `LoadCriteriaContextAsync` (private static in `GenerateHandler`, line 1943) is the criteria primitive. Make it callable from the from-description branch, promoting it to an internal static helper if needed.
- `SkillContent` / `AgentContent` already load embedded resources via `SkillResourceLoader`. SHA-256 hashes are computed at install time, not stored, so editing the `.md` resources is sufficient; there is no manifest table to regenerate.

### Phase 1 — SKILL & agent content (no code changes beyond .md files)

1. Add new section to `Skills/Content/Skills/spectra-generate.md`:
   - Heading: `## When the user wants to create a specific test case`
   - Numbered Step 1..5 sequence (open progress page → runInTerminal → awaitTerminal → readFile → present).
   - Command line: `spectra ai generate --suite {suite} --from-description "{description}" --context "{context}" --no-interaction --output-format json --verbosity quiet`.
   - Explicit "Do NOT run analysis. Do NOT ask how many tests. Always 1 test." line.
   - Routing table mapping intent signal → flow.

2. Add new section to `Skills/Content/Agents/spectra-generation.agent.md`:
   - Heading: `## Test Creation Intent Routing`.
   - Three intent classes (Intent 1: explore area → `--focus`, Intent 2: specific test → `--from-description`, Intent 3: from suggestions → `--from-suggestions`) with examples and actions.
   - Ambiguous-intent rule: topic-vs-scenario; never ask about count.

3. Verify SkillContent/AgentContent dictionaries still resolve (smoke test in build).

### Phase 2 — Doc-aware `--from-description` (CLI code)

1. **`UserDescribedGenerator.cs`** — refactor:
   - Add public `static string BuildPrompt(string description, string? context, string suite, IReadOnlyCollection<string> existingIds, string? documentContext, string? criteriaContext)` method that returns the AI prompt string. This makes prompt construction testable without invoking AI.
   - Add optional parameters `string? documentContext = null`, `string? criteriaContext = null`, and `IReadOnlyList<string>? sourceRefPaths = null` to `GenerateAsync(...)`.
   - When `documentContext` is non-null: insert "## Reference Documentation (for formatting context only)" section in the prompt.
   - When `criteriaContext` is non-null: insert "## Related Acceptance Criteria" section.
   - When `sourceRefPaths` is non-null: populate the returned `TestCase.SourceRefs` from those paths instead of `[]`.
   - Keep AI's `criteria` output (already populated by `agent.GenerateTestsAsync`) flowing into `TestCase.Criteria`.
   - Keep `grounding.verdict = Manual` unconditionally.

2. **`GenerateHandler.cs`** — modify `ExecuteFromDescriptionAsync`:
   - Promote `LoadCriteriaContextAsync` from `private static` to allow reuse, OR call directly (it is already in the same class).
   - After loading config, before calling `generator.GenerateAsync`, perform best-effort load:
     ```csharp
     // Document context: stays null / empty when nothing matches or loading fails.
     string? docContext = null;
     IReadOnlyList<string> docPaths = [];
     try
     {
         var loader = new SourceDocumentLoader(config.Source);
         // Cap each document at 8000 chars; the 3-document cap is applied after filtering.
         var allDocs = await loader.LoadAllAsync(currentDir, maxDocuments: null, maxContentLengthPerDoc: 8000, ct);
         var matching = allDocs
             .Where(d => MatchesSuite(d, suite))
             .Take(3)
             .ToList();
         if (matching.Count > 0)
         {
             docContext = FormatDocContext(matching);
             docPaths = matching.Select(d => d.Path).ToList();
         }
     }
     catch { /* best-effort */ }

     // Criteria context: loaded independently so a doc failure does not skip it.
     string? criteriaContext = null;
     try { criteriaContext = await LoadCriteriaContextAsync(currentDir, suite, config, ct); }
     catch { /* best-effort */ }
     ```
   - `MatchesSuite` is a small private helper: case-insensitive contains on `doc.Path` filename or `doc.Title`.
   - `FormatDocContext` produces a delimited string of `## {title}\n{content}\n`.
   - Pass `docContext`, `criteriaContext`, `docPaths` to `generator.GenerateAsync`.

3. **JSON result**: no shape change. `source_refs` and `criteria` are persisted via `TestFileWriter`, which already writes them. No `GenerateResult` schema change needed.
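
The two helpers named in step 2 could look roughly like the sketch below. It is illustrative only: `SourceDoc` is a hypothetical stand-in for whatever type `SourceDocumentLoader` returns, `DocContextSketch` is an invented container class, and the matching rule follows the one-line description in the plan rather than the committed code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the loader's document type (assumption, not the real model).
public sealed record SourceDoc(string Path, string Title, string Content);

public static class DocContextSketch
{
    // Case-insensitive "contains" on the file name (without extension) or the title.
    public static bool MatchesSuite(SourceDoc doc, string suite)
    {
        if (string.IsNullOrWhiteSpace(suite)) return false;

        var fileName = System.IO.Path.GetFileNameWithoutExtension(doc.Path) ?? string.Empty;
        return fileName.Contains(suite, StringComparison.OrdinalIgnoreCase)
            || doc.Title.Contains(suite, StringComparison.OrdinalIgnoreCase);
    }

    // Concatenates documents as "## {title}\n{content}\n" blocks, in input order.
    public static string FormatDocContext(IEnumerable<SourceDoc> docs)
    {
        var sb = new StringBuilder();
        foreach (var doc in docs)
        {
            sb.Append("## ").Append(doc.Title).Append('\n');
            sb.Append(doc.Content).Append('\n');
        }
        return sb.ToString();
    }
}
```

Under these assumptions, a document at `docs/Checkout-Flow.md` matches the suite `checkout` through its file name even if its title does not mention the suite. Substring matching is deliberately loose here, since the context is only used for formatting and never for verification.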

### Phase 3 — Tests

1. **`UserDescribedGeneratorTests`** (new):
   - `BuildPrompt_WithoutContext_DoesNotIncludeRefSection`
   - `BuildPrompt_WithDocContext_IncludesRefDocumentationSection`
   - `BuildPrompt_WithCriteriaContext_IncludesAcceptanceCriteriaSection`
   - `BuildPrompt_WithBothContexts_IncludesBoth`
   - `BuildPrompt_IncludesUserDescriptionAsSourceOfTruth`

2. **`GenerateSkillContentTests`** (new):
   - `GenerateSkill_HasFromDescriptionSection` — asserts `SkillContent.Generate.Contains("create a specific test case")`.
   - `GenerateSkill_HasIntentRoutingTable` — asserts the table headers ("User intent", "Signal", "Flow") all present.
   - `GenerateSkill_FromDescriptionUsesCorrectFlags` — asserts `--from-description` line contains `--no-interaction` and `--output-format json` and `--verbosity quiet`.
   - `GenerationAgent_HasIntentRoutingSection` — asserts agent content contains "Test Creation Intent Routing" + "--from-description" + "--focus".
   - `GenerationAgent_RoutesToFromDescriptionForSpecificTest` — asserts agent content includes the example "Add a test for".
   - `GenerationAgent_DoesNotAskAboutCountInRoutingRules` — asserts the "do NOT ask clarifying questions about count" instruction exists.

3. **Integration tests**: deferred. The from-description path invokes AgentFactory, which requires real AI. Coverage of FR-008..FR-014 is via the prompt-building unit tests + manual smoke; no integration test will be added in this spec, to keep the test suite isolated from network/AI dependencies.
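
To make the prompt-builder tests above concrete, here is a minimal sketch of a static builder with the signature given in Phase 2. Only the two section headings come from the plan; the surrounding prompt wording, the `PromptSketch` class name, and the two heading constants are assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

public static class PromptSketch
{
    // Section headings quoted from the plan.
    public const string DocHeading = "## Reference Documentation (for formatting context only)";
    public const string CriteriaHeading = "## Related Acceptance Criteria";

    public static string BuildPrompt(
        string description,
        string? context,
        string suite,
        IReadOnlyCollection<string> existingIds,
        string? documentContext,
        string? criteriaContext)
    {
        var sb = new StringBuilder();

        // Illustrative wording only; the real prompt text is not shown in the plan.
        sb.AppendLine($"Write exactly one manual test case for suite '{suite}'.");
        sb.AppendLine("The user's description is the source of truth:");
        sb.AppendLine(description);

        if (!string.IsNullOrWhiteSpace(context))
            sb.AppendLine($"Additional context: {context}");
        if (existingIds.Count > 0)
            sb.AppendLine($"Do not reuse these IDs: {string.Join(", ", existingIds)}");

        // Optional sections are emitted only when the caller supplied context.
        if (documentContext is not null)
        {
            sb.AppendLine();
            sb.AppendLine(DocHeading);
            sb.AppendLine(documentContext);
        }
        if (criteriaContext is not null)
        {
            sb.AppendLine();
            sb.AppendLine(CriteriaHeading);
            sb.AppendLine(criteriaContext);
        }

        return sb.ToString();
    }
}
```

Because the builder is a pure function of its arguments, each test in the list reduces to a `Contains` or `DoesNotContain` assertion on the returned string, with no AI call involved.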

### Phase 4 — Documentation updates

Update the 9 doc files listed in the spec:
- `CLAUDE.md` — add 033 to Recent Changes.
- `PROJECT-KNOWLEDGE.md` — add 033 implemented entry.
- `README.md` — add "create a specific test" example near Quick Start.
- `docs/getting-started.md` — add from-description example.
- `docs/cli-reference.md` — note `--from-description` doc/criteria context.
- `docs/skills-integration.md` — describe new from-description flow + intent routing.
- `docs/test-format.md` — note `source_refs`/`criteria` may be populated for manual tests.
- `docs/cli-vs-chat-generation.md` — update Dimension 8.
- `docs/coverage.md` — note manual tests can now contribute to coverage.

(If any of these files do not exist, skip that line item; they are optional polish.)

## Risks & Mitigations

| Risk | Mitigation |
|------|------------|
| Doc loading slows from-description noticeably | Cap at 3 docs × 8000 chars; load inline (awaited) inside the best-effort try block; no timeout needed since file I/O is bounded. |
| AI emits criteria IDs that don't exist | Acceptable — coverage analyzer will simply not match them. The criteria context tells the AI which IDs are valid, so this should be rare. |
| SKILL .md changes break existing skill content tests | Search existing tests for hardcoded SKILL strings before edit; update them in the same change. |
| Refactoring `BuildPrompt` to static breaks existing call site | The existing `GenerateAsync` will still build the prompt internally (calling `BuildPrompt`), so call sites are unchanged. |
