Commit 826e9dc
fix(v1.43.0): tolerant parser, critic logging, testimize logging, page polling
Four issues fixed in one cut, addressing user reports of:
- analysis_failed on slow models that nearly succeeded (truncated JSON)
- no visibility into Testimize lifecycle ("did it actually run?")
- progress page stuck on previous run state until manual refresh
- no way to attribute cost between analyzer / generator / critic / testimize
(1) Tolerant analysis parser:
- BehaviorAnalyzer.ParseAnalysisResponse now has a tolerant fallback path
when JsonDocument.Parse rejects the response
- New ParseTolerant walker tracks string/escape state and brace depth,
yielding every balanced {...} object inside the first array
- Partial objects at the truncation tail are silently skipped
- Returns null only when zero behaviors could be recovered
- Diagnostic data shows DeepSeek-V3.2 frequently produces 24KB+ responses
that are 95% valid but missing the closing braces (max output token
limit hit mid-stream) — those now recover instead of failing
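The walker described in (1) can be illustrated with a short Python sketch. This is not the C# ParseTolerant implementation; the function names iter_balanced_objects and parse_tolerant are hypothetical, and the sketch assumes the first "[" in the response opens the behaviors array.

```python
import json
from typing import Iterator, List, Optional


def iter_balanced_objects(text: str) -> Iterator[str]:
    """Yield each balanced {...} object found inside the first [...] array."""
    start = text.find("[")
    if start == -1:
        return
    depth = 0          # current brace depth inside the array
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True when the previous string char was a backslash
    obj_start = -1
    for i in range(start + 1, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                obj_start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                yield text[obj_start:i + 1]
        elif ch == "]" and depth == 0:
            return  # the first array closed normally


def parse_tolerant(text: str) -> Optional[List[dict]]:
    """Return every recoverable object, or None when none could be recovered."""
    recovered = []
    for chunk in iter_balanced_objects(text):
        try:
            recovered.append(json.loads(chunk))
        except json.JSONDecodeError:
            continue  # balanced but still invalid: skip it
    return recovered or None
```

Braces and brackets inside string values are ignored because the walker only counts depth while outside a string, and an object cut off by truncation never returns to depth zero, so it is dropped without raising.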
(2) Shared DebugLogger + per-component refactor:
- New src/Spectra.CLI/Infrastructure/DebugLogger.cs with static Enabled
flag and Append(component, message) method
- BehaviorAnalyzer and CopilotGenerationAgent private DebugLog helpers
refactored to thin wrappers around the shared logger
- GenerateHandler sets DebugLogger.Enabled = config.Ai.DebugLogEnabled
once per run (both code paths)
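As a rough illustration of the shape described in (2), here is a Python analogue of a shared logger with a process-wide enabled flag and an append(component, message) method. The line layout (timestamp, padded component tag) and the wrapper name are assumptions made for the sketch, not the format of the real DebugLogger.cs.

```python
from datetime import datetime, timezone
from pathlib import Path


class DebugLogger:
    """Process-wide debug log shared by all components (hypothetical analogue)."""

    enabled = False                    # set once per run from config
    path = Path(".spectra-debug.log")  # file name taken from the commit text

    @classmethod
    def append(cls, component: str, message: str) -> None:
        """Append one line to the log; do nothing when logging is disabled."""
        if not cls.enabled:
            return
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        # Assumed layout: timestamp, component tag padded to 8 chars, message.
        line = f"{stamp} [{component:<8}] {message}\n"
        with cls.path.open("a", encoding="utf-8") as fh:
            fh.write(line)


def analyzer_debug_log(message: str) -> None:
    """Thin per-component wrapper, mirroring the refactored private helpers."""
    DebugLogger.append("analyze", message)
```

Keeping the flag in one place means each component's helper stays a one-liner and the handler only has to flip the switch once per run.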
(3) CopilotCritic.VerifyTestAsync now logs each call:
- CRITIC START test_id=X model=M timeout=Ts
- CRITIC OK test_id=X verdict=V score=N elapsed=Ts
- CRITIC TIMEOUT test_id=X model=M configured_timeout=Ts elapsed=Ts
- CRITIC ERROR test_id=X exception=T message="..." elapsed=Ts
- Per-test logging enables cost attribution: every CRITIC START line is one
billable critic API call. A --count 100 run produces ~100 START lines.
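For cost attribution, the START lines can be tallied straight from the debug log. The Python sketch below assumes only the key=value layout listed above (CRITIC START ... model=M ...); the function name count_critic_calls is hypothetical, and the test IDs and model names in the sample are made up for illustration.

```python
import re
from collections import Counter

# Matches a CRITIC START line and captures the value of its model= field.
START_RE = re.compile(r"\bCRITIC START\b.*?\bmodel=(\S+)")


def count_critic_calls(log_text: str) -> Counter:
    """Count CRITIC START lines per model; each one is a billable critic call."""
    calls: Counter = Counter()
    for line in log_text.splitlines():
        match = START_RE.search(line)
        if match:
            calls[match.group(1)] += 1
    return calls


# Made-up sample in the documented line shape.
sample = """\
CRITIC START test_id=TC-001 model=model-a timeout=120s
CRITIC OK test_id=TC-001 verdict=pass score=9 elapsed=4s
CRITIC START test_id=TC-002 model=model-a timeout=120s
CRITIC TIMEOUT test_id=TC-002 model=model-a configured_timeout=120s elapsed=120s
"""
print(count_critic_calls(sample))
```

Counting only START lines avoids double counting, since every call also emits one OK, TIMEOUT, or ERROR line.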
(4) Critic timeout honors critic.timeout_seconds:
- Pre-v1.43.0 the runtime ignored the config field (default 30 seconds) and
hardcoded a 2-minute timeout
- v1.43.0 honors it; default raised from 30 → 120 seconds to preserve the
previously hardcoded behavior
- Slow critic models (Sonnet, GPT-4 Turbo) can now extend it
- Two test fixtures updated (CriticConfigTests, CriticConfigLoadingTests)
(5) Testimize lifecycle logging in CopilotGenerationAgent:
- TESTIMIZE DISABLED (testimize.enabled=false in config) — always emitted
when off, so users can verify
- TESTIMIZE START command=X args=[...]
- TESTIMIZE NOT_INSTALLED command=X
- TESTIMIZE UNHEALTHY command=X
- TESTIMIZE HEALTHY command=X tools_added=2 strategy=X mode=Y
- TESTIMIZE DISPOSED (in finally block)
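The ordering of these lifecycle lines can be pictured with a small Python sketch. The control flow here is an assumption inferred from the line names (in particular, where the try/finally begins), the function and parameter names are hypothetical, and the extra fields on the real START and HEALTHY lines (args, tools_added, strategy, mode) are omitted.

```python
from typing import Callable, List


def run_testimize_lifecycle(
    enabled: bool,
    command: str,
    is_installed: Callable[[], bool],
    is_healthy: Callable[[], bool],
    log: Callable[[str], None],
) -> bool:
    """Emit one lifecycle line per step; return True when Testimize is usable."""
    if not enabled:
        # Always logged when off, so the user can confirm it did not run.
        log("TESTIMIZE DISABLED (testimize.enabled=false in config)")
        return False
    log(f"TESTIMIZE START command={command}")
    try:
        if not is_installed():
            log(f"TESTIMIZE NOT_INSTALLED command={command}")
            return False
        if not is_healthy():
            log(f"TESTIMIZE UNHEALTHY command={command}")
            return False
        log(f"TESTIMIZE HEALTHY command={command}")
        return True
    finally:
        # Runs on every path after START, including early returns.
        log("TESTIMIZE DISPOSED")


lines: List[str] = []
run_testimize_lifecycle(True, "testimize", lambda: True, lambda: False, lines.append)
print(lines)
```

With this shape, every run leaves at least one TESTIMIZE line in the log, which answers the "did it actually run?" question directly.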
(6) Progress page polling on terminal pages:
- ProgressPageWriter JS used to stop polling entirely on terminal status,
so a new run's rewritten .spectra-progress.html wasn't picked up until
manual refresh
- v1.43.0: terminal pages poll at 5s intervals; in-progress pages still
poll at 1.5s
- Comment in code updated explaining the reasoning
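The polling rule in (6) reduces to choosing the next delay from the current status. The real logic lives in the page's JavaScript; the Python sketch below only illustrates the decision, and the status names in TERMINAL_STATUSES are hypothetical.

```python
# Hypothetical status names; the commit text only says "terminal status".
TERMINAL_STATUSES = {"completed", "failed"}

IN_PROGRESS_DELAY_S = 1.5  # in-progress pages keep the fast poll
TERMINAL_DELAY_S = 5.0     # terminal pages keep polling, just more slowly


def next_poll_delay_seconds(status: str) -> float:
    """Return the delay before the next poll; polling never stops entirely."""
    return TERMINAL_DELAY_S if status in TERMINAL_STATUSES else IN_PROGRESS_DELAY_S
```

Because a terminal page still polls, a rewritten .spectra-progress.html from a new run is picked up within about five seconds instead of requiring a manual refresh.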
(7) analysis_failed error message:
- Shows the actual configured analysis_timeout_minutes value (was a
hardcoded "(default 2)" string)
- Lists three diagnosis paths: low timeout, model output truncation,
schema mismatch
- Mentions the v1.43.0+ tolerant parser
Diagnostic data from real DeepSeek-V3.2 runs (from .spectra-debug.log):
- generation: 163s avg per 8-test batch, range 62-308s (~20s/test)
- analysis: 568s on first attempt (parse failed), 600s timeout on retry
- Both within model expectations — no regression
docs/configuration.md:
- Critic timeout note added
- .spectra-debug.log example expanded to show all four prefixes
([analyze ], [generate], [critic ], TESTIMIZE)
- Cost attribution section added — count START lines per component to
estimate billing
- Testimize lifecycle line table
CLAUDE.md:
- Recent Changes entry for v1.43.0 with all six fixes summarized
1551 total tests still pass (491 Core + 709 CLI + 351 MCP).
1 parent 7d2d07c, commit 826e9dc
File tree
11 files changed (+262, -72 lines)
- docs
- src
  - Spectra.CLI
    - Agent/Copilot
    - Commands/Generate
    - Infrastructure
    - Progress
  - Spectra.Core/Models/Config
- tests
  - Spectra.CLI.Tests/Config
  - Spectra.Core.Tests/Models/Config