Skip to content

Commit 08cbb97

Browse files
Merge branch '039-unify-critic-providers'
2 parents 756c40b + a04c11c commit 08cbb97

File tree

18 files changed

+944
-34
lines changed

18 files changed

+944
-34
lines changed

CLAUDE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -209,6 +209,7 @@ spectra config list-automation-dirs # List dirs with existence s
209209
- **Tests:** xUnit with structured results (never throw on validation errors)
210210

211211
## Recent Changes
212+
- 039-unify-critic-providers: ✅ COMPLETE - Aligned the critic provider validation set with the generator provider list. `CriticFactory.SupportedProviders` now contains exactly the canonical 5 (`github-models`, `azure-openai`, `azure-anthropic`, `openai`, `anthropic`) — same as `ai.providers[].name`. Removed `azure-deepseek` (not in the generator set) and `google` (Copilot SDK cannot route to it). New private `LegacyAliases` dictionary maps `github` → `github-models` (soft alias with one-line stderr deprecation warning); new private `HardErrorProviders` set contains `google` and produces a hard validation error listing all 5 supported providers. New `internal static ResolveProvider(string?)` helper centralizes the lookup; called from both `TryCreate` and `TryCreateAsync` (in the async path BEFORE the Copilot availability check so unknown providers fail fast). New public `DefaultProvider = "github-models"` constant; empty/whitespace input falls back to it. Case-insensitive matching with lowercase normalization. `IsSupported` returns true for canonical names AND legacy `github`, false for `google`/unknown. `CriticConfig.Provider` XML doc updated to list the canonical 5; `GetEffectiveModel()` and `GetDefaultApiKeyEnv()` switch statements gain explicit cases for `github-models`, `azure-openai`, `azure-anthropic` (defaults: `gpt-4o-mini`/`gpt-4o-mini`/`claude-3-5-haiku-latest` and `GITHUB_TOKEN`/`AZURE_OPENAI_API_KEY`/`AZURE_ANTHROPIC_API_KEY`); legacy `github`/`google` cases retained for read-side safety. Existing `CriticFactoryTests`, `CriticAuthTests`, `VerificationIntegrationTests`, `CriticConfigTests` updated to drop `google` from "supported" assertions and add `azure-openai`/`azure-anthropic` to the canonical theory data. Docs updated: `docs/configuration.md` lists the canonical 5 with two new examples (Azure-only billing, cross-provider critic); `docs/grounding-verification.md` example uses `github-models` instead of `google`, supported list updated, legacy-values note added. New `CriticFactoryProviderTests.cs` with 8 tests (azure-openai accept, azure-anthropic accept, case-insensitive normalization, github→github-models alias with stderr warning, google hard error with 5-provider listing, unknown provider error, empty fallback, whitespace fallback). Backward compatible: all existing configurations on `openai`/`anthropic`/`github-models` continue to work unchanged. 1551 total tests passing (491 Core + 709 CLI + 351 MCP).
212213
- 038-testimize-integration: ✅ COMPLETE - Optional integration with the external Testimize.MCP.Server tool for algorithmic test data optimization (BVA, EP, pairwise, ABC). Disabled by default; zero impact on users who do not opt in. New `testimize` section in `spectra.config.json` (`enabled` defaults to `false`, `mode`, `strategy`, optional `settings_file`, `mcp.command`/`args`, optional `abc_settings`). New models in `src/Spectra.Core/Models/Config/TestimizeConfig.cs` (`TestimizeConfig`, `TestimizeMcpConfig`, `TestimizeAbcSettings`). New `SpectraConfig.Testimize` property defaulting to `new()`. New `src/Spectra.CLI/Agent/Testimize/TestimizeMcpClient.cs` — `IAsyncDisposable` wrapper around the Testimize child process with stdio JSON-RPC framing (`Content-Length` header), `StartAsync` (returns false on missing tool, never throws), `IsHealthyAsync` (5s budget), `CallToolAsync` (30s budget, returns null on any failure), idempotent `DisposeAsync` (kills child process tree). New `src/Spectra.CLI/Agent/Testimize/TestimizeTools.cs` with two `AIFunction` factories: `CreateGenerateTestDataTool` (forwards to Testimize MCP `generate_hybrid_test_cases`/`generate_pairwise_test_cases`/`generate_combinatorial_test_cases` based on strategy) and `CreateAnalyzeFieldSpecTool` (local heuristic extractor for "X to Y characters", "between X and Y", "valid email", "required field"). Tool description explicitly mentions "boundary values", "equivalence classes", and "pairwise" so the model knows when to call them. New `TestimizeDetector.IsInstalledAsync` shells out to `dotnet tool list -g` (5s timeout, never throws). `CopilotGenerationAgent.GenerateTestsAsync` conditionally registers the Testimize tools after the existing 7 tools when `_config.Testimize.Enabled` and the MCP client passes a health probe; the client is disposed in a `finally` block on every exit path (success, exception, cancellation). `BehaviorAnalyzer.BuildAnalysisPrompt` and `GenerationAgent.BuildFullPrompt` populate a new `testimize_enabled` placeholder (`"true"` or `""`). `behavior-analysis.md` and `test-generation.md` templates gain `{{#if testimize_enabled}}` blocks with technique-aware instructions. New `spectra testimize check` CLI command (`src/Spectra.CLI/Commands/Testimize/TestimizeCheckHandler.cs` + `TestimizeCommand.cs`) reports enabled/installed/healthy/mode/strategy/settings status, supports `--output-format json`, never starts the MCP process when disabled (FR-028), includes install instruction when not installed. New `TestimizeCheckResult` typed result. Embedded `Templates/spectra.config.json` updated to include the `testimize` section with `enabled: false` so `spectra init` writes it by default. Existing `spectra init` already refuses to overwrite an existing config without `--force`, so re-init preservation is automatic. Graceful degradation everywhere: tool not installed → warning + AI fallback; process crashes mid-generation → catch + AI fallback; server returns null/empty/garbage → AI fallback; per-call 30s timeout → AI fallback. Exit code 0 in all degradation paths. New tests: `TestimizeConfigTests` (6), `TestimizeMcpClientTests` (4), `TestimizeMcpClientGracefulTests` (4), `TestimizeToolsTests` (2), `AnalyzeFieldSpecTests` (7), `TestimizeDetectorTests` (1), `TestimizeConditionalBlockTests` (4), `TestimizeCheckHandlerTests` (3) — 31 net new tests (488 Core + 699 CLI + 351 MCP = 1538 total passing).
213214
- 037-istqb-test-techniques: ✅ COMPLETE - ISTQB test design techniques in prompt templates. All five built-in prompt templates (`src/Spectra.CLI/Prompts/Content/*.md`) updated to teach the AI six ISTQB black-box techniques: Equivalence Partitioning (EP), Boundary Value Analysis (BVA), Decision Table (DT), State Transition (ST), Error Guessing (EG), Use Case (UC). `behavior-analysis.md` rewritten with six TECHNIQUE sections, distribution guidelines (40%-of-any-category cap, ≥4 BVA per range, mandatory invalid state transitions), and a new `technique` field in the JSON output schema. `test-generation.md` adds a "TEST DESIGN TECHNIQUE RULES" section with WRONG/RIGHT examples for BVA exact values, EP class naming, DT condition values, ST starting/action/resulting state, and EG concrete error scenarios. `test-update.md` adds a "Technique Completeness Check" so updates flag tests as OUTDATED when docs introduce new ranges/rules/states. `critic-verification.md` adds a "Technique Verification" section returning PARTIAL for BVA boundary mismatches and HALLUCINATED for DT conditions not in docs. `criteria-extraction.md` adds optional `technique_hint` per criterion. New `IdentifiedBehavior.Technique` string field (default `""` for backward compatibility), new `BehaviorAnalysisResult.TechniqueBreakdown` map (excludes empty techniques), new `AcceptanceCriterion.TechniqueHint` `string?` (omitted from YAML when null). `BehaviorAnalyzer.BuildAnalysisPrompt` legacy fallback updated with condensed ISTQB instructions. `GenerateAnalysis.TechniqueBreakdown` always serialized as `{}` when empty for stable SKILL/CI contract. `AnalysisPresenter` and `ProgressPageWriter` render a "Technique Breakdown" section beneath the existing category breakdown in fixed display order BVA, EP, DT, ST, EG, UC. `CriteriaExtractor` parses and normalizes the technique hint; `LoadCriteriaContextAsync` forwards `technique_hint` into the generation prompt. `BuildPrompt` reinforces ISTQB technique application in the per-batch user prompt. `spectra-generate` SKILL shows both category and technique breakdowns from analysis JSON. `spectra-help` SKILL gains an "ISTQB Test Design Techniques" section pointing users to `spectra prompts reset --all`. `spectra-quickstart` SKILL flagged as updated with the new techniques. `spectra-generation` agent prompt notes the technique breakdown. Existing user-customized `.spectra/prompts/*.md` files are NOT auto-overwritten (spec 022 hash-tracking preserved); users opt in via `spectra prompts reset` or `spectra update-skills`. New tests: `BehaviorAnalysisTemplateTests` (6), `BehaviorAnalyzerTechniqueTests` (3), `BehaviorAnalyzerLegacyFallbackTests` (3), `BehaviorAnalysisResultTechniqueTests` (2), `GenerateResultTechniqueBreakdownTests` (3), `ProgressPageTechniqueBreakdownTests` (3), `TestGenerationTemplateTests` (6), `PromptTemplateMigrationTests` (2), `TestUpdateTemplateTests` (5), `CriticVerificationTemplateTests` (5), `CriteriaExtractionTemplateTests` (6), `AcceptanceCriterionTechniqueHintTests` (4). One existing test (`BuildPrompt_WithoutLoader_UsesLegacy`) updated to assert on the new `boundary` and `error_handling` categories (replacing the dropped `performance` assertion). 48 new tests. 1507 total tests passing (482 Core + 674 CLI + 351 MCP).
214215
- 034-github-pages-docs: ✅ COMPLETE - GitHub Pages documentation site. Deployed existing `docs/` markdown files as a branded documentation site using Just the Docs Jekyll theme at `automatetheplanet.github.io/Spectra/`. Custom `spectra` color scheme matching ATP brand identity (dark navy sidebar #16213e, teal accents #00bcd4, white content area). Sidebar navigation with hierarchical grouping (User Guide, Architecture, Execution Agents, Deployment sections). Built-in client-side search across all pages. Landing page with quick install and feature overview. New `.github/workflows/docs.yml` (jekyll-build-pages → deploy-pages) auto-deploys on push to `main` when `docs/**` changes; one-time manual setting required (Settings → Pages → Source: GitHub Actions). "Edit this page" links on every page via `gh_edit_*` config. Excluded machine-generated files (`_index.md`, `_index.yaml`, `_index.json`, `criteria/`, `**/*.criteria.yaml`, `_criteria_index.yaml`) from site build. Added YAML frontmatter (`title`, `nav_order`, `parent`) to all 20 existing user-facing doc files in their actual subfolder locations — no files moved. Created 8 new files: `docs/_config.yml`, `docs/Gemfile`, `docs/_sass/color_schemes/spectra.scss`, `docs/index.md`, `docs/user-guide.md`, `docs/architecture.md`, `docs/execution-agents.md`, `docs/deployment.md`. Updated `README.md` with docs site badge and pointer. Documentation-only change — no C# code, no test changes.

docs/configuration.md

Lines changed: 36 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -114,12 +114,47 @@ Grounding verification configuration. See [Grounding Verification](grounding-ver
114114
| Property | Type | Default | Description |
115115
|----------|------|---------|-------------|
116116
| `enabled` | bool | `false` | Enable dual-model verification |
117-
| `provider` | string | | Critic provider (`google`, `openai`, `anthropic`, `github`) |
117+
| `provider` | string | `github-models` | Critic provider — same set as the generator: `github-models`, `azure-openai`, `azure-anthropic`, `openai`, `anthropic`. The legacy value `github` is still recognized as an alias for `github-models` (deprecation warning). |
118118
| `model` | string || Critic model identifier |
119119
| `api_key_env` | string || Environment variable for critic API key |
120120
| `base_url` | string || Custom API endpoint |
121121
| `timeout_seconds` | int | `30` | Critic request timeout |
122122

123+
#### Example: Azure-only billing (generator + critic on the same account)
124+
125+
```json
126+
{
127+
"ai": {
128+
"providers": [
129+
{ "name": "azure-openai", "model": "gpt-4.1-mini", "enabled": true }
130+
],
131+
"critic": {
132+
"enabled": true,
133+
"provider": "azure-openai",
134+
"model": "gpt-4o"
135+
}
136+
}
137+
}
138+
```
139+
140+
#### Example: Cross-provider critic for independent verification
141+
142+
```json
143+
{
144+
"ai": {
145+
"providers": [
146+
{ "name": "azure-openai", "model": "gpt-4.1-mini", "enabled": true }
147+
],
148+
"critic": {
149+
"enabled": true,
150+
"provider": "anthropic",
151+
"model": "claude-3-5-haiku-latest",
152+
"api_key_env": "ANTHROPIC_API_KEY"
153+
}
154+
}
155+
}
156+
```
157+
123158
## `generation` — Test Generation Settings
124159

125160
| Property | Type | Default | Description |

docs/grounding-verification.md

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -88,21 +88,29 @@ Configure the critic in `spectra.config.json`:
8888
"ai": {
8989
"critic": {
9090
"enabled": true,
91-
"provider": "google",
92-
"model": "gemini-2.0-flash",
91+
"provider": "github-models",
92+
"model": "gpt-4o-mini",
9393
"timeout_seconds": 30
9494
}
9595
}
9696
}
9797
```
9898

99-
Supported critic providers: `google`, `openai`, `anthropic`, `github`
99+
Supported critic providers (spec 039 — same set as the generator):
100+
`github-models`, `azure-openai`, `azure-anthropic`, `openai`, `anthropic`.
100101

101102
Default API key environment variables:
102-
- Google: `GOOGLE_API_KEY`
103-
- OpenAI: `OPENAI_API_KEY`
104-
- Anthropic: `ANTHROPIC_API_KEY`
105-
- GitHub: `GITHUB_TOKEN`
103+
- `github-models`: `GITHUB_TOKEN`
104+
- `azure-openai`: `AZURE_OPENAI_API_KEY`
105+
- `azure-anthropic`: `AZURE_ANTHROPIC_API_KEY`
106+
- `openai`: `OPENAI_API_KEY`
107+
- `anthropic`: `ANTHROPIC_API_KEY`
108+
109+
> **Legacy values**: `provider: "github"` is accepted as a soft alias for
110+
> `github-models` (with a one-line deprecation warning on stderr). The legacy
111+
> value `provider: "google"` is no longer supported — the Copilot SDK runtime
112+
> cannot route to Google. Update your config to one of the canonical five
113+
> providers above.
106114

107115
## Skip Verification
108116

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
# Specification Quality Checklist: Unified Critic Provider List
2+
3+
**Purpose**: Validate specification completeness and quality before proceeding to planning
4+
**Created**: 2026-04-11
5+
**Feature**: [spec.md](../spec.md)
6+
7+
## Content Quality
8+
9+
- [x] No implementation details (languages, frameworks, APIs)
10+
- [x] Focused on user value and business needs
11+
- [x] Written for non-technical stakeholders
12+
- [x] All mandatory sections completed
13+
14+
## Requirement Completeness
15+
16+
- [x] No [NEEDS CLARIFICATION] markers remain
17+
- [x] Requirements are testable and unambiguous
18+
- [x] Success criteria are measurable
19+
- [x] Success criteria are technology-agnostic
20+
- [x] All acceptance scenarios are defined
21+
- [x] Edge cases are identified
22+
- [x] Scope is clearly bounded
23+
- [x] Dependencies and assumptions identified
24+
25+
## Feature Readiness
26+
27+
- [x] All functional requirements have clear acceptance criteria
28+
- [x] User scenarios cover primary flows
29+
- [x] Feature meets measurable outcomes defined in Success Criteria
30+
- [x] No implementation details leak into specification
31+
32+
## Notes
33+
34+
- 3 user stories: P1 (Azure-only billing), P2 (legacy compat), P2 (docs alignment).
35+
- Branch number 039 (auto-allocated; source doc was labelled 025).
36+
- Small alignment fix — no architectural change.
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
{
2+
"$schema": "https://json-schema.org/draft/2020-12/schema",
3+
"$id": "https://spectra.local/contracts/critic-provider.schema.json",
4+
"title": "CriticConfig.provider field",
5+
"description": "Schema for ai.critic.provider in spectra.config.json after spec 039. The five canonical names are accepted (case-insensitive). Legacy 'github' is a soft alias for 'github-models'. Legacy 'google' is rejected.",
6+
"type": "string",
7+
"anyOf": [
8+
{
9+
"enum": ["github-models", "azure-openai", "azure-anthropic", "openai", "anthropic"],
10+
"description": "Canonical provider names — same set as the generator."
11+
},
12+
{
13+
"const": "github",
14+
"description": "Legacy alias — accepted, rewritten to 'github-models', stderr warning emitted."
15+
}
16+
],
17+
"not": {
18+
"const": "google"
19+
}
20+
}
Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
# Data Model: Unified Critic Provider List
2+
3+
**Feature**: 039-unify-critic-providers
4+
**Date**: 2026-04-11
5+
6+
## Overview
7+
8+
No new entities. Two existing files change semantically.
9+
10+
## Entities
11+
12+
### CriticConfig (modified — docstring + switch defaults only)
13+
14+
**File**: `src/Spectra.Core/Models/Config/CriticConfig.cs`
15+
16+
The on-disk JSON schema is unchanged. What changes:
17+
18+
| Field | Change |
19+
|-------|--------|
20+
| `Provider` | Docstring updated from `"google", "openai", "anthropic", "github"` to `"github-models", "azure-openai", "azure-anthropic", "openai", "anthropic"` (with note that legacy `github` and `google` are still recognized). |
21+
| `GetEffectiveModel()` | Switch statement gains explicit cases for `github-models`, `azure-openai`, `azure-anthropic` mapping to the same defaults as the generator. Legacy cases (`google`, `github`) retained for read-side safety. |
22+
| `GetDefaultApiKeyEnv()` | Same shape — gains canonical cases, legacy cases retained. |
23+
24+
### CriticFactory (modified)
25+
26+
**File**: `src/Spectra.CLI/Agent/Critic/CriticFactory.cs`
27+
28+
| Member | Change |
29+
|--------|--------|
30+
| `SupportedProviders` | Reduced from 7 entries to the canonical 5: `github-models`, `azure-openai`, `azure-anthropic`, `openai`, `anthropic`. (Removes `azure-deepseek`, `google`.) |
31+
| `LegacyAliases` (NEW, private static) | Map: `"github" → "github-models"` (soft, with deprecation warning). |
32+
| `HardErrorProviders` (NEW, private static) | Set: `{ "google" }`. |
33+
| `TryCreate(CriticConfig)` | Now validates `config.Provider` against the canonical set + legacy aliases. Returns `Failed(...)` for unknown or hard-error providers. Emits a one-line stderr deprecation warning when an alias is used. Uses the resolved/normalized provider name when constructing `CopilotCritic`. |
34+
| `TryCreateAsync(CriticConfig, CancellationToken)` | Same validation as `TryCreate`, applied before the Copilot availability check. |
35+
| `IsSupported(string)` | Updated to also accept legacy aliases (returns true for `github`). Returns false for `google` and unknown values. |
36+
37+
## State Transitions
38+
39+
None.
40+
41+
## Validation Rules
42+
43+
| Rule | Source | Enforcement |
44+
|------|--------|-------------|
45+
| Provider name in canonical 5 (or alias) | spec FR-001, FR-002 | Hard: `CriticFactory.TryCreate` returns `Failed` otherwise. |
46+
| Case-insensitive | spec FR-002 | Hard: lowercase before lookup. |
47+
| `google` → hard error | spec FR-006 | Hard: `Failed` with explicit error message listing supported providers. |
48+
| `github` → soft alias to `github-models` | spec FR-005 | Soft: rewrite + stderr warning. |
49+
| Empty/null provider when enabled | spec FR-004 | Soft: fall back to default `github-models`. |
50+
51+
## Migration
52+
53+
No on-disk migration. Config files are unchanged. Users with `provider: "github"` see a deprecation warning on next run; users with `provider: "google"` see a hard error directing them to update.

0 commit comments

Comments
 (0)