feat(aspire): actionable diagnostics when resources fail to start #6293
Surface why an Aspire resource failed instead of an opaque
"failed to start"/timeout with raw logs.
- Read the live CustomResourceSnapshot (ExitCode, HealthReports,
state) at failure/timeout and classify the cause: container
runtime down, image-pull failure, port-in-use, OOM (137),
non-zero exit, health-check-failing, never-started, crashed.
Each class maps to one actionable hint.
- Distinguish pending states in the timeout message: "Running but
not Healthy (<reason>)" vs "still Starting" vs "FailedToStart,
exit code N".
- Report the WaitFor dependency chain ("api waiting on 'migrations'
which Exited, exit code 1") via WaitAnnotation.
- Capture the state/health transition timeline in the background
monitor and fold it into the diagnostics; extend the monitor over
the resource-wait phase so health-wait hangs are covered.
- Keep the exception message concise; write the full timeline plus
untruncated per-resource logs to an attached Artifact (test, with
session fallback).
Also: add Exited to the fail-fast watch (clean crash-exits no longer
wait the full timeout); lock LogProgress (shared static stderr stream
spliced under concurrent writers); fix LogLine rendering to use
.Content instead of the noisy generated record ToString().
18 pure no-Docker unit tests in AspireDiagnosticsTests; builds on
net8.0/net9.0/net10.0.
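A minimal sketch of how the classification step described above could look as a pure function. The type and member names (`FailureClass`, `StartupDiagnosis`), the log signatures, the order of the checks, and the hint wording are all assumptions for illustration; the PR's real `Classify`/`HintFor` signatures are not shown in this thread.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins, not the PR's actual types.
public enum FailureClass
{
    ContainerRuntimeDown,
    ImagePullFailure,
    PortInUse,
    OutOfMemory,
    NonZeroExit,
    HealthCheckFailing,
    NeverStarted,
    Crashed
}

public static class StartupDiagnosis
{
    // Pure: works on already-captured snapshot fields and log lines, performs no I/O.
    // Intended to be called only for resources already known to be in trouble.
    public static FailureClass Classify(
        string? state, int? exitCode, bool anyUnhealthy, IReadOnlyList<string> logLines)
    {
        // Assumed log signatures, checked first because they name the cause directly.
        foreach (var line in logLines)
        {
            if (Has(line, "Cannot connect to the Docker daemon")) return FailureClass.ContainerRuntimeDown;
            if (Has(line, "pull access denied")) return FailureClass.ImagePullFailure;
            if (Has(line, "address already in use")) return FailureClass.PortInUse;
        }

        if (exitCode == 137) return FailureClass.OutOfMemory; // 128 + SIGKILL (9)
        if (exitCode is not (null or 0)) return FailureClass.NonZeroExit;
        if (state == "Running" && anyUnhealthy) return FailureClass.HealthCheckFailing;
        if (state is null or "Starting") return FailureClass.NeverStarted;
        return FailureClass.Crashed;
    }

    // One hint per class; the wording here is illustrative only.
    public static string HintFor(FailureClass c) => c switch
    {
        FailureClass.ContainerRuntimeDown => "Start the container runtime and re-run.",
        FailureClass.ImagePullFailure => "Check the image name/tag and registry credentials.",
        FailureClass.PortInUse => "Free the port or let the host pick a random one.",
        FailureClass.OutOfMemory => "Raise the container memory limit.",
        FailureClass.NonZeroExit => "Inspect the resource logs for the failing command.",
        FailureClass.HealthCheckFailing => "Check the health endpoint and its dependencies.",
        FailureClass.NeverStarted => "Check what the resource is waiting on.",
        _ => "Inspect the attached diagnostics artifact."
    };

    private static bool Has(string line, string signature) =>
        line.Contains(signature, StringComparison.OrdinalIgnoreCase);
}
```

Because nothing here touches Docker or the Aspire host, each branch can be exercised with plain values in a unit test, which is the property the no-Docker test suite relies on.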
Not up to standards ⛔

🔴 Issues

| Category | Results |
|---|---|
| ErrorProne | 1 medium |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 21 |
Code Review
This is a well-designed improvement to Aspire startup diagnostics. The separation of pure classification logic (Classify, ScanLogSignatures, HintFor, DescribeState) from side-effectful I/O makes the code testable and the 18 unit tests are exactly the right kind to have here. The real bug fixes (LogLine .Content, stderr locking, Exited in fail-fast, monitor lifetime spanning both phases) are all valid. A few concerns worth raising:
High: Exited in fail-fast may trigger false failures on short-lived resources
WaitForResourcesWithFailFastAsync now watches for both FailedToStart and Exited:
string[] failureStates = [KnownResourceStates.FailedToStart, KnownResourceStates.Exited];
If a short-lived resource (e.g. a DB migration runner or seeder) exits successfully with code 0, it may transition to Exited state and would immediately trigger the failure path, even though it succeeded. In Aspire, Finished is typically used for cleanly-exited executable resources, but Exited can be ambiguous. A failed test output saying "Resource 'migrations' failed to start" for a seeder that ran fine would be very confusing.
Suggested improvement: Check the exit code before treating Exited as a failure, i.e. only fail-fast on Exited when an exit code is present and non-zero:
// Inside the failure-detection task:
await notificationService.WaitForResourceAsync(name, evt =>
    evt.Snapshot.State?.Text == KnownResourceStates.FailedToStart ||
    (evt.Snapshot.State?.Text == KnownResourceStates.Exited && evt.Snapshot.ExitCode is not (null or 0)),
    failureCts.Token);
This preserves the intent (catch a crash-exited container) without catching a successfully-completed job.
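The same condition can be pulled out into a small pure predicate so its truth table is unit-testable without an Aspire host. This is a sketch under assumptions: the class name `FailFast` is hypothetical, and the state names are passed as plain strings assumed to match the `KnownResourceStates` values used above.

```csharp
#nullable enable
public static class FailFast
{
    // FailedToStart always counts as a failure; Exited counts only when an
    // exit code is present and non-zero, so a clean one-shot run (code 0) passes.
    public static bool IsFailureState(string? state, int? exitCode) =>
        state == "FailedToStart" ||
        (state == "Exited" && exitCode is not (null or 0));
}
```

With this shape, a migration runner that ends in Exited with code 0 is not reported as failed, while Exited with code 1 or 137 still trips the fail-fast path.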
Medium: Sequential log collection compounds timeout on multi-resource failures
BuildDiagnosticsAndAttachAsync collects logs for each problem resource sequentially:
foreach (var name in problemResources)
{
    var logLines = await CollectResourceLogLinesAsync(app, name); // 5s timeout each
    ...
}
With 3 failing resources, that's up to 15 seconds added to an already-failed test's teardown. Consider collecting all resources in parallel:
var logsByName = await Task.WhenAll(problemResources.Select(async name =>
    (name, logs: await CollectResourceLogLinesAsync(app, name))));
Medium: SummarizeDependencies reports only the first dependency
The concise exception message (SummarizeDependencies) exits on the very first wait annotation, so a resource waiting on multiple dependencies only shows one in the short message. The full artifact (via AppendDependencyChain) correctly shows all. Consider returning a comma-joined summary or the first unresolved dependency specifically:
// Show all pending deps instead of just the first
var pendingDeps = waits
    .Where(w => /* dep not yet in ready state */)
    .Select(w => FormatDep(w))
    .ToList();
return pendingDeps.Count == 0 ? null : string.Join(", ", pendingDeps);
Low: Timeline cap drops the oldest (most diagnostic) transitions
RecordTransition trims from the front when the timeline exceeds 200 entries:
while (_timeline.Count > 200 && _timeline.TryDequeue(out _))
{
}
For a flapping resource, this drops the initial startup transitions that show what happened first, which is the most useful information for diagnosing a cascading failure. If bounding is needed, consider keeping the first N and last N entries instead of a pure sliding window.
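A sketch of the "first N and last N" idea as a pure helper. The name `TimelineTrimmer.Trim` and its parameters are hypothetical, and the head/tail sizes are left to the caller rather than assumed.

```csharp
using System;
using System.Collections.Generic;

public static class TimelineTrimmer
{
    // Keeps the first `head` and the last `tail` items and drops the middle.
    // Input that already fits within head + tail is returned as an unchanged copy.
    public static List<T> Trim<T>(IReadOnlyList<T> items, int head, int tail)
    {
        if (head < 0) throw new ArgumentOutOfRangeException(nameof(head));
        if (tail < 0) throw new ArgumentOutOfRangeException(nameof(tail));

        if (items.Count <= head + tail)
        {
            return new List<T>(items);
        }

        var result = new List<T>(head + tail);
        for (var i = 0; i < head; i++)
        {
            result.Add(items[i]); // initial startup story
        }
        for (var i = items.Count - tail; i < items.Count; i++)
        {
            result.Add(items[i]); // most recent transitions
        }
        return result;
    }
}
```

For ten items numbered 0 to 9, `Trim(items, 2, 3)` keeps 0, 1, 7, 8, 9: the opening transitions and the latest ones survive, and only the middle churn is lost.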
Low: FormatTimeline silently returns empty on single transitions
if (transitions.Length < 2) return string.Empty;
A single recorded state transition (e.g. resource immediately jumped to FailedToStart) is still useful to see. The < 2 guard loses this. The concern about "nothing to show" doesn't apply when there's one real transition; the guard makes more sense as == 0.
Low: DateTime.Now in artifact filenames — use UtcNow
var fileName = $"aspire-diagnostics-{typeof(TAppHost).Name}-{DateTime.Now:yyyyMMdd-HHmmss-fff}.log";
CI runners typically run in UTC, but DateTime.Now is timezone-dependent. Using DateTime.UtcNow makes artifact names consistent across environments.
Low: Bare catch {} is too broad
TryWriteDiagnosticsArtifactAsync catches all exceptions including OutOfMemoryException and ThreadAbortException. The comment ("diagnostics must never mask the underlying startup failure") is correct in spirit, but prefer catch (Exception ex) with at least a Debug.WriteLine or Trace.Write so the failure isn't entirely silent during development:
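catch (Exception ex) when (ex is not OutOfMemoryException)
{
    // diagnostics must never mask the underlying startup failure,
    // but leave a trace so the write failure is visible during development
    Debug.WriteLine($"Failed to write diagnostics artifact: {ex}");
}
Nice-to-have: HealthSignature is per-resource but doesn't include report names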
The health signature used for change detection is just the worst-status string ("Unhealthy", "Healthy", etc.). If two health checks for the same resource flip independently but cancel out at the aggregate level (one gets better, one gets worse), the change is invisible. For the timeline this is probably fine (it'll detect most changes), but a fuller signature like string.Join(",", reports.OrderBy(r => r.Name).Select(r => $"{r.Name}:{r.Status}")) would be more precise.
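To make the difference concrete, here is a small sketch comparing an aggregate signature with a per-check one. `HealthCheckResult` and `HealthSignatures` are hypothetical stand-ins (the real code reads Aspire's health report snapshots), and "worst status" is approximated as "Unhealthy if any check is Unhealthy".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one health report entry.
public readonly record struct HealthCheckResult(string Name, string Status);

public static class HealthSignatures
{
    // Aggregate signature: collapses all checks into a single worst status.
    public static string Worst(IEnumerable<HealthCheckResult> reports) =>
        reports.Any(r => r.Status == "Unhealthy") ? "Unhealthy" : "Healthy";

    // Per-check signature: name:status pairs in a stable order, so any single
    // check changing status changes the signature.
    public static string PerCheck(IEnumerable<HealthCheckResult> reports) =>
        string.Join(",", reports
            .OrderBy(r => r.Name, StringComparer.Ordinal)
            .Select(r => $"{r.Name}:{r.Status}"));
}
```

If `db` goes from Healthy to Unhealthy while `cache` goes from Unhealthy to Healthy, `Worst` reports "Unhealthy" both times and the change is missed, whereas `PerCheck` moves from "cache:Unhealthy,db:Healthy" to "cache:Healthy,db:Unhealthy".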
Summary
The PR's architecture is solid — the pure-function/side-effect separation is clean, the artifact attachment approach is practical, and the real bug fixes are correct. The main concern is the Exited fail-fast behavior, which could cause confusing false failures for short-lived successful resources. The other issues are incremental quality improvements.
- Fail-fast on Exited only when exit code is non-zero, so a one-shot resource that exits cleanly (code 0) isn't mis-reported as a failure (switch to the predicate WaitForResourceAsync overload).
- Collect per-resource logs in parallel instead of sequentially, so multiple failing resources don't compound the collection timeout.
- SummarizeDependencies reports the first *unsatisfied* dependency (the one actually blocking), not just the first declared one.
- Timeline cap now keeps both head and tail (drops only the middle churn) so the initial startup story survives a flapping resource.
- FormatTimeline renders a single transition (immediate failures are still worth showing); only an empty timeline yields "".
- HealthSignature is per-check (name:status) so independent checks flipping in opposite directions still register, and the timeline shows which check moved.
- Artifact filename uses UtcNow; artifact-write failures are logged via LogProgress rather than silently swallowed.

19 unit tests pass; builds on net8.0/net9.0/net10.0.
Thanks for the review. Addressed in 6b36d0a. Verified each against the code; all were valid.
19 unit tests pass; builds on net8.0/net9.0/net10.0.
Code Review: feat(aspire): actionable diagnostics when resources fail to start
Overview
This PR transforms opaque Aspire startup failures into actionable diagnostics. The core design — classify → hint → describe → attach — is clean and well-structured. The 18 unit tests covering the pure logic are a good foundation, the bug fixes (log line ToString(), StdErr locking, Exited fail-fast) are all correct, and the parallel log collection via Task.WhenAll is the right call.
No blocking issues. A few things worth noting:
Issue 1: Artifact filename collision in parallel test runs
File: AspireFixture.cs — TryWriteDiagnosticsArtifactAsync
var fileName = $"aspire-diagnostics-{typeof(TAppHost).Name}-{DateTime.UtcNow:yyyyMMdd-HHmmss-fff}.log";
Two fixture instances with the same TAppHost type that time out within the same millisecond will race to write the same path. One will silently overwrite the other's artifact, and the first test's artifact pointer will point at corrupted/wrong content.
Suggested fix:
var fileName = $"aspire-diagnostics-{typeof(TAppHost).Name}-{DateTime.UtcNow:yyyyMMdd-HHmmss-fff}-{Guid.NewGuid():N}.log";
This is low-probability but the fix is one token and eliminates the class of error entirely.
Issue 2: Double TryGetCurrentState lookup for the same resource
File: AspireFixture.cs — DiagnoseResource and AppendResourceDiagnosticBlock
Both methods independently call notificationService.TryGetCurrentState(name, ...). The snapshot is fetched twice, which is fine if TryGetCurrentState is a cheap dictionary lookup, but it also means the two calls could observe different state if a transition happens between them (unlikely during diagnostics, but technically possible).
Suggested refactor: Capture the snapshot once in DiagnoseResource and thread it through to AppendResourceDiagnosticBlock. The state divergence risk is low, but the architectural clarity is higher — the diagnostic block should be rendering a snapshot, not re-querying live state mid-render.
// In ResourceDiagnosis record, add the snapshot:
internal readonly record struct ResourceDiagnosis(
    string Name,
    string State,
    int? ExitCode,
    ResourceFailureClass Class,
    string? Detail,
    string? Hint,
    ResourceEvent? Event); // carry the event for downstream rendering
Issue 3: Timeline reads can race with drain/refill during the failure path
File: AspireFixture.cs — RecordTransition and BuildDiagnosticsAndAttachAsync
The comment on RecordTransition says "reads happen after it stops", but FormatTimeline() is called from BuildDiagnosticsAndAttachAsync, which is called from within the try block — before StopMonitorAsync runs in the finally. This means _timeline.ToArray() (in FormatTimeline) can execute concurrently with the drain-refill loop:
Thread A (monitor): TryDequeue loop draining → queue empty → not yet re-enqueued
Thread B (timeout): FormatTimeline → _timeline.ToArray() → sees [] ← stale empty snapshot
Thread A: re-enqueues head + tail
The result is a timeline that shows as empty in the diagnostic output for the failure case. Since this is diagnostic-only and ConcurrentQueue.ToArray() is itself thread-safe, this won't cause any crash or incorrect exception. But it means the most useful artifact (the timeline showing what states the resource cycled through) can silently be empty exactly when it matters most.
Suggested fix: Stop the monitor before calling BuildDiagnosticsAndAttachAsync, or simply call StopMonitorAsync before building diagnostics in the failure paths. The small delay (waiting for the monitor task to observe the cancel) is worth the guarantee that the timeline is complete.
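An alternative that does not depend on stop ordering is to mutate and snapshot the timeline under one lock, so a reader can never observe the half-drained state. The sketch below uses hypothetical names (`LockedTimeline`, `Record`, `Snapshot`) and a simple drop-oldest cap to stay short; the trimming policy is independent of the locking.

```csharp
using System.Collections.Generic;

public sealed class LockedTimeline
{
    private readonly object _gate = new();
    private readonly List<string> _entries = new();
    private readonly int _cap;

    public LockedTimeline(int cap) => _cap = cap;

    // Called by the background monitor. Append and trim happen inside the
    // same critical section, so no intermediate state is ever visible.
    public void Record(string entry)
    {
        lock (_gate)
        {
            _entries.Add(entry);
            while (_entries.Count > _cap)
            {
                _entries.RemoveAt(0);
            }
        }
    }

    // Called by the failure path while the monitor may still be running.
    // Returns a copy taken under the lock, safe to format afterwards.
    public string[] Snapshot()
    {
        lock (_gate)
        {
            return _entries.ToArray();
        }
    }
}
```

The cost is a short lock per transition, which is negligible at the rate resource states change, and the diagnostics path no longer has to wait for the monitor task to observe cancellation.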
Issue 4: RecordTransition trimming is untested
The head/tail compaction logic in RecordTransition is the most algorithmically novel part of the PR and has a non-obvious invariant (the just-enqueued item must end up in the tail). This is the one piece of code that isn't covered by the 18 unit tests. A test with > 400 transitions asserting that the head (first N) and tail (last N) survive compaction would be a good addition.
Strengths worth calling out
- Classification pipeline (Classify → HintFor → DescribeState) is the right abstraction: it's pure, testable, and composable. The unit tests validate each layer independently.
- Parallel log collection with Task.WhenAll is the correct approach; sequential collection with per-resource 5s timeouts would compound badly.
- Exited fail-fast fix (only triggering on non-zero exit) is a meaningful correctness improvement: migration runners and seeders that exit cleanly were mis-classified before.
- StdErr locking is a real bug fix; the race was always there with multi-fixture runs.
- line.Content bug fix is a clear correctness improvement.
- Artifact fallback to TestSessionContext is a thoughtful touch for when there's no active TestContext.
Overall this is solid work. The issues above are all low-severity relative to the value the PR delivers.
- Guard the timeline with a lock (List instead of ConcurrentQueue): the background monitor records into it while a failing path reads it (the monitor is stopped in the finally, after diagnostics are built), so the previous drain/refill could expose a transient empty timeline exactly when it mattered.
- Extract TrimTimeline as a pure, unit-tested helper (keeps head + tail, drops the middle).
- Capture each resource's snapshot once and thread the ResourceEvent through DiagnoseResource and AppendResourceDiagnosticBlock instead of re-querying live state mid-render.
- Add a GUID to the artifact filename so two fixtures of the same TAppHost timing out in the same millisecond can't overwrite each other's artifact.

21 unit tests pass; builds on net8.0/net9.0/net10.0.
Second review pass addressed in d21fab6; all four were valid.
21 unit tests pass; builds on net8.0/net9.0/net10.0.
Code Review — Third Pass
All issues from both previous review rounds have been correctly addressed. Here's a verification pass on each fix, plus a look at the final state.
Previous Issues — All Resolved ✅
Round 1 fixes (6b36d0a):
- Exited false-positives: IsFailureState correctly gates on snapshot.ExitCode is not (null or 0), so migration runners exiting cleanly with code 0 are excluded. ✅
- Sequential log collection: now Task.WhenAll over all problem resources. ✅
- SummarizeDependencies now skips ready deps and surfaces the first unresolved blocking dependency. ✅
- Timeline cap: head/tail compaction (first 80 + last 120) instead of pure sliding window. ✅
- FormatTimeline guard relaxed from < 2 to == 0, so single transitions now render. ✅
- Artifact filename: DateTime.UtcNow ✅
- Bare catch {}: now catch (Exception ex) with LogProgress surfacing the failure. ✅
- HealthSignature: now a per-check name:status string, so independent checks flipping in opposite directions both register. ✅
Round 2 fixes (d21fab6):
- Artifact filename collision: {Guid.NewGuid():N} added, so same-TAppHost fixtures can't race to the same path. ✅
- Double TryGetCurrentState: snapshot is fetched once in BuildDiagnosticsAndAttachAsync and threaded through to both DiagnoseResource and AppendResourceDiagnosticBlock as a parameter, with no mid-render re-query. ✅
- Timeline read/drain race: _timeline is now a plain List<StateTransition> guarded by _timelineGate; both RecordTransition and FormatTimeline() take the lock, giving FormatTimeline a consistent snapshot via _timeline.ToArray() under the lock. ✅
- TrimTimeline extracted as internal static and covered by two unit tests (400→200 with head+tail preserved; under-cap unchanged). ✅
Final State: Clean
The architecture is sound end-to-end:
- Pure classification pipeline (Classify → HintFor → DescribeState): all tested independently, no I/O dependencies.
- StateTransition record is correctly defined as internal readonly record struct with positional constructor, accessible via using static in the test file.
- AppendResourceDiagnosticBlock receives the pre-fetched ResourceEvent? ev as a parameter alongside the ResourceDiagnosis, a cleaner alternative to embedding the event in the record struct that keeps the diagnostic model free of raw Aspire API types.
- IsFailureState gate is precise: FailedToStart (always) or Exited with non-zero/non-null exit code (crash exit without FailedToStart marker). Clean one-shot resources are unaffected.
- 21 unit tests covering classification, hints, descriptions, log-signature scanning, timeline formatting, and timeline compaction, all pure-function tests with no Docker dependency.
No new issues found. This is ready to merge.
Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.55.2 to 1.56.25.

<details>
<summary>Release notes</summary>

_Sourced from [TUnit's releases](https://github.com/thomhurst/TUnit/releases)._

## 1.56.25

## What's Changed

### Other Changes
* feat(aspire): actionable diagnostics when resources fail to start by @thomhurst in thomhurst/TUnit#6293

### Dependencies
* chore(deps): update tunit to 1.56.18 by @thomhurst in thomhurst/TUnit#6285
* chore(deps): update aspire to 13.4.6 by @thomhurst in thomhurst/TUnit#6287
* chore(deps): update dependency mockolate to 3.3.0 by @thomhurst in thomhurst/TUnit#6291

**Full Changelog**: thomhurst/TUnit@v1.56.18...v1.56.25

## 1.56.18

## What's Changed

### Other Changes
* fix(mocks): mock deep interface hierarchies (#6264) by @thomhurst in thomhurst/TUnit#6283
* feat: artifact retention for auto-uploaded HTML report (#6266) by @thomhurst in thomhurst/TUnit#6270
* fix(mocks): forward asymmetric `new`-hidden property slots per-accessor (#6263) by @thomhurst in thomhurst/TUnit#6281
* fix: honor OverloadResolutionPriority on net8 consumers (#6276, #6280) by @thomhurst in thomhurst/TUnit#6282

### Dependencies
* chore(deps): update tunit to 1.56.0 by @thomhurst in thomhurst/TUnit#6259
* chore(deps): update dependency streamjsonrpc to 2.25.29 by @thomhurst in thomhurst/TUnit#6258
* chore(deps): update aspire to 13.4.5 by @thomhurst in thomhurst/TUnit#6267
* chore(deps): bump launch-editor from 2.12.0 to 2.14.1 in /docs by @dependabot[bot] in thomhurst/TUnit#6268
* chore(deps): bump @babel/core from 7.28.5 to 7.29.7 in /docs by @dependabot[bot] in thomhurst/TUnit#6269
* chore(deps): update dependency dompurify to v3.4.11 by @thomhurst in thomhurst/TUnit#6271
* chore(deps): update dependency serialize-javascript to v7.0.6 by @thomhurst in thomhurst/TUnit#6272
* chore(deps): update actions/checkout action to v7 by @thomhurst in thomhurst/TUnit#6277
* chore(deps): update verify to 31.20.0 by @thomhurst in thomhurst/TUnit#6278
* chore(deps): bump webpack-dev-server from 5.2.4 to 5.2.5 in /docs by @dependabot[bot] in thomhurst/TUnit#6273

**Full Changelog**: thomhurst/TUnit@v1.56.0...v1.56.18

## 1.56.0

## What's Changed

### Other Changes
* fix(aspnetcore): serialize WithWebHostBuilder to stop _derivedFactories race (flaky disposal NRE) by @thomhurst in thomhurst/TUnit#6251
* fix(mocks): wrap a real object whose class has no parameterless ctor (#6253) by @thomhurst in thomhurst/TUnit#6255
* fix(mocks): implement `new`-hidden base interface members in wrapper (#6252) by @thomhurst in thomhurst/TUnit#6256
* fix(mocks): mocking a method with more params than Func/Action arity (#6254) by @thomhurst in thomhurst/TUnit#6257

### Dependencies
* chore(deps): update tunit to 1.55.2 by @thomhurst in thomhurst/TUnit#6248
* chore(deps): update aspire to 13.4.4 by @thomhurst in thomhurst/TUnit#6249
* chore(deps): update dependency stackexchange.redis to v3 by @thomhurst in thomhurst/TUnit#6250

**Full Changelog**: thomhurst/TUnit@v1.55.2...v1.56.0

Commits viewable in [compare view](thomhurst/TUnit@v1.55.2...v1.56.25).
</details>
Updated [TUnit.Core](https://github.com/thomhurst/TUnit) from 1.55.2 to 1.56.25.
Why
When an Aspire resource fails to start under TUnit, the failure was hard to diagnose: the fixture only read State.Text, so a user got an opaque "Resource 'x' failed to start" / timeout plus raw logs, with no exit code, no health-check reason, no clue which dependency was blocking, and a state-transition timeline that went only to stderr and was lost.

This makes the failure itself answer what's wrong and what to do next, and routes the detail into the test report.

What changed (all in TUnit.Aspire.Core/AspireFixture.cs)
- Read the live CustomResourceSnapshot (ExitCode, HealthReports, state) and classify the cause: container-runtime-down, image-pull failure, port-in-use, OOM (exit 137), non-zero exit, health-check-failing, never-started, crashed-no-code. Each class maps to one actionable hint.
- Distinguish pending states in the timeout message: "Running but not Healthy (health 'ready' Unhealthy: connection refused)" vs "still Starting (never reached Running)" vs "FailedToStart, exit code 137 (OOM)", derived from each resource's current snapshot.
- Report the WaitFor dependency chain, expanding "Waiting" into "api waiting on 'migrations' (Exited, exit code 1)" via WaitAnnotation.
- Capture the state/health transition timeline in the background monitor, which now spans StartAsync and the resource-wait phase so health-wait hangs are covered.
- Keep the exception message concise and write the full timeline plus untruncated per-resource logs to an attached Artifact (on the failing test, with TestSessionContext fallback), so it shows up in trx/HTML/IDE.

Incidental fixes
- Add Exited to the fail-fast watch: a container that crash-exits cleanly never reports FailedToStart, so previously we waited the full timeout for an already-dead resource.
- Lock LogProgress: the background monitor, the main init thread, and multiple fixture instances all write the shared static stderr stream; without a lock, lines splice.
- Fix LogLine rendering: $" {line}" emitted the noisy generated record ToString() (LogLine { LineNumber = .. }); it now uses .Content (also in WatchResourceLogs).

Notes
- New members are private/internal. Both source-gen and reflection modes; Native-AOT safe (public property reads only, no member reflection).
- HealthStatus is null unless state == Running, so diagnosis reads HealthReports directly.

Testing
- 18 pure no-Docker unit tests in TUnit.Aspire.Tests/AspireDiagnosticsTests.cs (classification, hints, descriptions, log-signature scan, timeline formatting); all pass.