
Core: Agentic observability for vitest and ghost stories #34537

Merged
yannbf merged 27 commits into project/sb-agentic-setup from sidnioulz/agentic-telemetry-ws2
Apr 15, 2026

Conversation

@yannbf
Member

@yannbf yannbf commented Apr 14, 2026

Closes #

What I did

This PR does the following:

  • Adds a telemetry utility that detects whether the current run falls within the initial session window following a particular event
  • Adds a Vitest reporter that uses this utility
  • Improves error categorization
  • Cleans up ghost-stories code

Checklist for Contributors

Testing

The changes in this PR are covered in the following automated tests:

  • stories
  • unit tests
  • integration tests
  • end-to-end tests

Manual testing

Caution

This section is mandatory for all contributions. If you believe no manual test is necessary, please state so explicitly. Thanks!

Documentation

  • Add or update documentation reflecting your changes
  • If you are deprecating/removing a feature, make sure to update
    MIGRATION.MD

Checklist for Maintainers

  • When this PR is ready for testing, make sure to add ci:normal, ci:merged or ci:daily GH label to it to run a specific set of sandboxes. The particular set of sandboxes can be found in code/lib/cli-storybook/src/sandbox-templates.ts

  • Make sure this PR contains one of the labels below:

    Available labels
    • bug: Internal changes that fix incorrect behavior.
    • maintenance: User-facing maintenance tasks.
    • dependencies: Upgrading (sometimes downgrading) dependencies.
    • build: Internal-facing build tooling & test updates. Will not show up in release changelog.
    • cleanup: Minor cleanup style change. Will not show up in release changelog.
    • documentation: Documentation only changes. Will not show up in release changelog.
    • feature request: Introducing a new feature.
    • BREAKING CHANGE: Changes that break compatibility in some way with current major version.
    • other: Changes that don't fit in the above categories.

🦋 Canary release

This pull request has been released as version 0.0.0-pr-34537-sha-cd66a6a9. Try it out in a new sandbox by running npx storybook@0.0.0-pr-34537-sha-cd66a6a9 sandbox or in an existing project with npx storybook@0.0.0-pr-34537-sha-cd66a6a9 upgrade.

More information
Published version 0.0.0-pr-34537-sha-cd66a6a9
Triggered by @yannbf
Repository storybookjs/storybook
Branch sidnioulz/agentic-telemetry-ws2
Commit cd66a6a9
Datetime Tue Apr 14 13:34:30 UTC 2026 (1776173670)
Workflow run 24401970008

To request a new release of this pull request, mention the @storybookjs/core team.

core team members can create a new canary release here or locally with gh workflow run --repo storybookjs/storybook publish.yml --field pr=34537

Summary by CodeRabbit

  • New Features

    • Agent-aware telemetry for test runs with aggregated metrics, durations, and session-aware emissions
    • Story-level test conversion and analysis utilities
  • Improvements

    • More detailed test-run analysis (success rates, empty-render accounting, unique error counts)
    • Broader render/hook error detection and categorization
    • Render-analysis enabled in more scenarios; reporter state resets between runs
    • CLI event-log: include/exclude filtering and metadata masking
  • Tests

    • New and expanded tests for analysis, telemetry, conversion, and error-categorization logic
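
To make the aggregated metrics named in the summary above concrete (success rates, empty-render accounting, unique error counts), here is a minimal sketch. The `StoryTestResult` fields, the `RunSummary` shape, and the `summarizeRun` helper are all hypothetical names chosen for illustration; the PR's actual `analyzeTestResults` and `TestRunAnalysis` may be structured differently.

```typescript
// Hypothetical shapes for illustration only; the real StoryTestResult and
// TestRunAnalysis types in the PR may differ.
type StoryTestResult = {
  storyId: string;
  status: 'passed' | 'failed';
  emptyRender?: boolean;
  error?: string;
};

type RunSummary = {
  total: number;
  passed: number;
  failed: number;
  emptyRenders: number;
  successRate: number;
  uniqueErrorCount: number;
};

// Aggregates per-story results into run-level counters.
function summarizeRun(results: StoryTestResult[]): RunSummary {
  let passed = 0;
  let emptyRenders = 0;
  const uniqueErrors = new Set<string>();

  for (const result of results) {
    if (result.status === 'passed') passed += 1;
    if (result.emptyRender === true) emptyRenders += 1;
    // Identical messages are counted once, however many stories hit them.
    if (result.error) uniqueErrors.add(result.error);
  }

  const total = results.length;
  return {
    total,
    passed,
    failed: total - passed,
    emptyRenders,
    // Assumption: an empty run reports a rate of 0 rather than NaN.
    successRate: total === 0 ? 0 : passed / total,
    uniqueErrorCount: uniqueErrors.size,
  };
}
```

Keeping the unique-error set separate from the failure count is what lets a run with many failures but a single root cause stand out.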

@yannbf yannbf self-assigned this Apr 14, 2026
@yannbf yannbf added maintenance User-facing maintenance tasks ci:normal labels Apr 14, 2026
@nx-cloud

nx-cloud bot commented Apr 14, 2026

View your CI Pipeline Execution ↗ for commit e81a5db

| Command | Status | Duration | Result |
|---|---|---|---|
| nx run-many -t compile,check,knip,test,lint,fmt... | ✅ Succeeded | 7m 26s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-04-15 13:29:42 UTC

@nx-cloud

nx-cloud bot commented Apr 14, 2026

View your CI Pipeline Execution ↗ for commit 0834932


☁️ Nx Cloud last updated this comment at 2026-04-14 12:22:32 UTC

@coderabbitai
Contributor

coderabbitai bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Walkthrough

Introduces shared story-test types and analysis utilities, adds Vitest AgentTelemetryReporter that collects per-story results and emits telemetry, renames/extends the ghost-story runner to runStoryTests with ghostRun option, broadens error categorization, and wires session-aware render-analysis gating and telemetry helpers.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| A11y / Preview | code/addons/a11y/src/preview.tsx | Added inline comment that a11y checks skip ghost-story runs (no logic change). |
| Vitest reporter & tests | code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.ts, code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts | New AgentTelemetryReporter collecting per-story Vitest results, converting with toStoryTestResult, aggregating via analyzeTestResults, emitting telemetry; tests cover collection, filtering (example stories), aggregation, and run isolation. |
| Vitest plugin integration | code/addons/vitest/src/vitest-plugin/index.ts | Added withinAgenticSetupSession detection, enable globals.renderAnalysis.enabled during agentic setup, made configureVitest async, and conditionally inject AgentTelemetryReporter when appropriate. |
| Parse & run story tests | code/core/src/core-server/utils/ghost-stories/parse-vitest-report.ts, .../parse-vitest-report.test.ts, .../run-story-tests.ts | Replaced local parsing/aggregation with per-assertion toStoryTestResult mapping and delegated run-level analysis to analyzeTestResults; renamed runner to runStoryTests and added ghostRun option controlling env and cache key. |
| Ghost-stories annotations & types | code/core/src/core-server/utils/ghost-stories/test-annotations.ts, .../types.ts | Gated afterEach on globals.renderAnalysis?.enabled; removed local story/test types in favor of shared TestRunAnalysis types. |
| Shared test-result types & analyzers | code/core/src/shared/utils/test-result-types.ts, code/core/src/shared/utils/analyze-test-results.ts, code/core/src/shared/utils/analyze-test-results.test.ts | Added StoryTestResult, categorization types and TestRunAnalysis; implemented extractCategorizedErrors and analyzeTestResults with unit tests for totals, rates, empty-render handling, unique error counts, and categorized errors. |
| To-story-test-result utils & tests | code/core/src/shared/utils/to-story-test-result.ts, .../to-story-test-result.test.ts | New utilities to convert Vitest-like inputs to StoryTestResult, detect empty renders, and extract/clean error messages; tests added. |
| Error categorization patterns | code/core/src/shared/utils/categorize-render-errors.ts, .../categorize-render-errors.test.ts | Broadened matching for hook/render/provider/portal/component errors and extended unit tests with additional variants. |
| Core-server exports & channels | code/core/src/core-server/index.ts, .../server-channel/ai-setup-channel.ts, .../server-channel/ghost-stories-channel.ts, .../server-channel/ghost-stories-channel.test.ts | Re-exported analysis utilities/types from shared module; updated channels and tests to use runStoryTests and adjusted cache path usage and invocation to pass { ghostRun: true } where applicable. |
| Telemetry: session helper & types | code/core/src/telemetry/event-cache.ts, code/core/src/telemetry/index.ts, code/core/src/telemetry/types.ts | Added isWithinInitialSession(events) helper and re-exported it; added 'ai-setup-self-healing-scoring' event literal to EventType. |
| AI setup utils telemetry callsite | code/core/src/telemetry/ai-setup-utils.ts, code/core/src/core-server/ai-setup-channel.ts | Switched to runStoryTests import with { ghostRun: true }; deferred telemetry payload construction by passing an async factory to telemetry. |
| Misc / Scripts / Config | scripts/tsconfig.json, scripts/eval/lib/grade.ts, scripts/event-log-collector.ts | Removed duplicate tsconfig option; updated grading to call runStoryTests; enhanced event-log-collector with include/exclude filters and --no-metadata. |

Sequence Diagram(s)

sequenceDiagram
    participant Vitest as Vitest Runner
    participant Reporter as AgentTelemetryReporter
    participant Analyzer as analyzeTestResults
    participant Telemetry as Telemetry Service
    participant Files as Test Modules / FS

    Vitest->>Reporter: onInit(ctx)
    Reporter->>Reporter: store context & startTime

    Vitest->>Reporter: onTestCaseResult(testCase)
    Reporter->>Reporter: toStoryTestResult(testCase) -> append StoryTestResult

    Vitest->>Reporter: onTestRunEnd(testModules, unhandledErrors)
    Reporter->>Files: gather module errors
    Reporter->>Analyzer: analyzeTestResults(testResults)
    Analyzer-->>Reporter: TestRunAnalysis
    Reporter->>Telemetry: telemetry('ai-setup-self-healing-scoring', { agent, analysis, counts, duration, watch })
    Telemetry-->>Reporter: ack
    Reporter->>Reporter: reset internal buffers
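
The call order in the diagram can be sketched as a small, self-contained class. This is not the PR's `AgentTelemetryReporter`: the `TestCaseLike` and `TelemetryFn` types, the injected `send` and `now` functions, and the choice to ignore skipped cases are assumptions made so the sketch runs without Vitest or Storybook. The timer starts in a run-start hook, following the review comment on `agent-telemetry-reporter.ts`, so each run measures only its own execution time.

```typescript
// Hypothetical, self-contained sketch: none of these types come from Vitest
// or Storybook; they only mirror the call order shown in the diagram.
type TestCaseLike = { storyId: string; state: 'passed' | 'failed' | 'skipped' };
type TelemetryFn = (eventType: string, payload: Record<string, unknown>) => void;

class SketchTelemetryReporter {
  private results: TestCaseLike[] = [];
  private startTime = 0;
  private readonly send: TelemetryFn;
  private readonly now: () => number;

  constructor(send: TelemetryFn, now: () => number = Date.now) {
    this.send = send;
    this.now = now;
  }

  // Start timing when the run begins, not when the previous run ended.
  onTestRunStart(): void {
    this.startTime = this.now();
  }

  // Assumption: skipped cases are not collected.
  onTestCaseResult(testCase: TestCaseLike): void {
    if (testCase.state === 'skipped') return;
    this.results.push(testCase);
  }

  // Emit one aggregated event, then reset buffers so runs stay isolated.
  onTestRunEnd(): void {
    const passed = this.results.filter((r) => r.state === 'passed').length;
    this.send('ai-setup-self-healing-scoring', {
      total: this.results.length,
      passed,
      duration: this.now() - this.startTime,
    });
    this.results = [];
  }
}
```

Injecting `send` and `now` keeps the sketch deterministic under test; a real reporter would presumably call the telemetry module directly.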

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts (2)

5-13: Consider using spy: true for module mocks per coding guidelines.

The coding guidelines recommend using vi.mock() with spy: true option. While this mock intentionally replaces the module, consider if a spy-based approach would work:

🔧 Alternative approach using spy: true
-vi.mock('storybook/internal/telemetry', () => ({
-  telemetry: vi.fn(),
-  isExampleStoryId: vi.fn(
-    (id: string) =>
-      id.startsWith('example-button--') ||
-      id.startsWith('example-header--') ||
-      id.startsWith('example-page--')
-  ),
-}));
+vi.mock('storybook/internal/telemetry', { spy: true });

Then configure implementations in beforeEach:

beforeEach(() => {
  vi.clearAllMocks();
  vi.mocked(telemetry).mockImplementation(vi.fn());
  vi.mocked(isExampleStoryId).mockImplementation(
    (id: string) =>
      id.startsWith('example-button--') ||
      id.startsWith('example-header--') ||
      id.startsWith('example-page--')
  );
  // ...
});

As per coding guidelines: "Use vi.mock() with the spy: true option for all package and file mocks in Vitest tests."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts` around
lines 5 - 13, The test currently replaces the entire
'storybook/internal/telemetry' module via vi.mock(...) with concrete
implementations for telemetry and isExampleStoryId; per guidelines switch to
using vi.mock(..., { spy: true }) and move concrete implementations into the
test setup (e.g., beforeEach) by clearing mocks (vi.clearAllMocks()) and
assigning behavior with vi.mocked(telemetry).mockImplementation(...) and
vi.mocked(isExampleStoryId).mockImplementation(...); update the existing mocked
symbols telemetry and isExampleStoryId accordingly so the module is spied
instead of fully replaced.

118-142: Consider adding explicit assertions to onTestCaseResult tests.

These tests call onTestCaseResult but have no assertions, only verifying the method doesn't throw. The actual collection behavior is implicitly tested via onTestRunEnd tests, but explicit assertions would improve clarity.

For example, you could verify collected results count after each call if the reporter exposes internal state, or add comments explaining these are smoke tests for no-throw behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts` around
lines 118 - 142, The tests for onTestCaseResult currently only call
reporter.onTestCaseResult(...) without assertions; update each it block to
assert the reporter's collection behavior by checking an exposed result store
(e.g., call reporter.getCollectedResults() or inspect reporter._results) after
the call to verify the expected count or contents for the given storyId, and for
skipped cases assert that the store is unchanged or empty; if the reporter does
not expose state, either add a test-only accessor method to the reporter or
assert behavior via the next step (e.g., spy/stub the telemetry send function or
verify onTestRunEnd results) and otherwise add a brief comment clarifying the
test is intentionally a smoke/no-throw check.
code/core/src/shared/utils/analyze-test-results.ts (1)

40-50: Consider using a more specific type instead of Record<string, any>.

The categorizedErrors reduction uses Record<string, any> which loses type information. The structure is well-defined and could benefit from explicit typing.

🔧 Suggested type improvement
-  const categorizedErrors = Array.from(map.entries()).reduce<Record<string, any>>(
+  const categorizedErrors = Array.from(map.entries()).reduce<
+    Record<string, { uniqueCount: number; count: number; matchedDependencies: string[] }>
+  >(
     (acc, [category, data]) => {
       acc[category] = {
         uniqueCount: data.uniqueErrors.size,
         count: data.count,
         matchedDependencies: Array.from(data.matchedDependencies).sort(),
       };
       return acc;
     },
     {}
   );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/core/src/shared/utils/analyze-test-results.ts` around lines 40 - 50, The
reduction that builds categorizedErrors should use an explicit interface instead
of Record<string, any>; define a CategorySummary type/interface (e.g., {
uniqueCount: number; count: number; matchedDependencies: string[] }) and change
the reducer's generic from Record<string, any> to Record<string,
CategorySummary> (or to a TypedMap alias) so categorizedErrors, the reduce
initial type, and the mapped properties (uniqueCount, count,
matchedDependencies) are strongly typed; add the new type near the top of
analyze-test-results.ts and update the reduce call signature to use it.
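
A standalone sketch of the typed reduction described above. `CategorySummary` follows the shape given in the comment; `CategoryAccumulator` and the `summarizeCategories` wrapper are hypothetical, inferred only from the fields the reducer reads, and may not match the accumulator actually used in analyze-test-results.ts.

```typescript
type CategorySummary = {
  uniqueCount: number;
  count: number;
  matchedDependencies: string[];
};

// Hypothetical accumulator shape, inferred from the fields the reducer reads.
type CategoryAccumulator = {
  uniqueErrors: Set<string>;
  count: number;
  matchedDependencies: Set<string>;
};

// Converts the per-category accumulators into plain, serializable summaries.
function summarizeCategories(
  map: Map<string, CategoryAccumulator>
): Record<string, CategorySummary> {
  return Array.from(map.entries()).reduce<Record<string, CategorySummary>>(
    (acc, [category, data]) => {
      acc[category] = {
        uniqueCount: data.uniqueErrors.size,
        count: data.count,
        // Sorted so the output is stable across runs.
        matchedDependencies: Array.from(data.matchedDependencies).sort(),
      };
      return acc;
    },
    {}
  );
}
```

With the explicit type, a misspelled property such as `uniqueCounts` becomes a compile-time error instead of passing silently through `any`.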
code/core/src/shared/utils/analyze-test-results.test.ts (1)

6-6: Add explicit .ts extension to the mock path.

Per coding guidelines, relative imports should use explicit file extensions.

🔧 Suggested fix
-vi.mock('./categorize-render-errors', { spy: true });
+vi.mock('./categorize-render-errors.ts', { spy: true });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/core/src/shared/utils/analyze-test-results.test.ts` at line 6, Replace
the mock import path used in the test so it includes the explicit TypeScript
extension: update the vi.mock call that references './categorize-render-errors'
in analyze-test-results.test.ts to use './categorize-render-errors.ts' (keep the
existing options like { spy: true } unchanged) to follow the project's
relative-import extension guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code/addons/a11y/src/preview.tsx`:
- Around line 23-26: The current check treats any globals.renderAnalysis object
as internal mode (isInternalRenderAnalysis = !!globals.renderAnalysis), which
wrongly suppresses a11y when renderAnalysis exists but is disabled; change
isInternalRenderAnalysis to check the enabled flag (e.g., use
!!globals.renderAnalysis?.enabled) and update the dependent condition
(shouldRunEnvironmentIndependent) to rely on that so a11y is only gated when
renderAnalysis.enabled is true; locate and update the symbols
isInternalRenderAnalysis, globals.renderAnalysis, and
shouldRunEnvironmentIndependent in preview.tsx.

In `@code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.ts`:
- Around line 57-58: Move the test-duration timer start from the end of runs to
the beginning: remove or stop resetting this.startTime in onTestRunEnd() and
instead set this.startTime = Date.now() inside onTestRunStart() so each run
measures only execution time (reference symbols: this.startTime,
onTestRunStart(), onTestRunEnd()).

In `@code/core/src/telemetry/event-cache.ts`:
- Around line 121-127: The current selection logic using
eventTypes.map(...).find(...) finds the first available event by array order
rather than the chronologically latest; update the logic that computes
lastRelevantEvent to choose the event with the greatest timestamp from
lastEvents for the given eventTypes (mirror the approach used in lastEvent()
which sorts by timestamp descending or compute a max by timestamp).
Specifically, replace the map+find sequence with code that collects
lastEvents[type] for each type in eventTypes, filters out undefined, then picks
the item with the largest timestamp (or sorts descending and takes [0]) so
lastRelevantEvent truly represents the most recent occurrence.
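
The last finding, choosing the chronologically latest event rather than the first match in array order, can be sketched as follows. The `CachedEvent` and `LastEvents` shapes and the `latestEventOf` name are hypothetical; the real cache types in event-cache.ts may differ.

```typescript
// Hypothetical shapes; the real cache types in event-cache.ts may differ.
type CachedEvent = { timestamp: number; body: unknown };
type LastEvents = Record<string, CachedEvent | undefined>;

// Returns the most recent cached event among the given types, or undefined
// when none of them has been recorded.
function latestEventOf(
  lastEvents: LastEvents,
  eventTypes: string[]
): CachedEvent | undefined {
  let latest: CachedEvent | undefined;
  for (const type of eventTypes) {
    const candidate = lastEvents[type];
    if (candidate && (!latest || candidate.timestamp > latest.timestamp)) {
      latest = candidate;
    }
  }
  return latest;
}
```

A single pass with a running maximum avoids sorting and gives the same result regardless of the order in which `eventTypes` is listed.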

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 915c3a4b-11d9-4ac3-85a7-4774f037f77e

📥 Commits

Reviewing files that changed from the base of the PR and between f351600 and 0834932.

📒 Files selected for processing (18)
  • code/addons/a11y/src/preview.tsx
  • code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts
  • code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.ts
  • code/addons/vitest/src/vitest-plugin/index.ts
  • code/core/src/core-server/index.ts
  • code/core/src/core-server/utils/ghost-stories/parse-vitest-report.test.ts
  • code/core/src/core-server/utils/ghost-stories/parse-vitest-report.ts
  • code/core/src/core-server/utils/ghost-stories/test-annotations.ts
  • code/core/src/core-server/utils/ghost-stories/types.ts
  • code/core/src/shared/utils/analyze-test-results.test.ts
  • code/core/src/shared/utils/analyze-test-results.ts
  • code/core/src/shared/utils/categorize-render-errors.test.ts
  • code/core/src/shared/utils/categorize-render-errors.ts
  • code/core/src/shared/utils/test-result-types.ts
  • code/core/src/telemetry/event-cache.ts
  • code/core/src/telemetry/index.ts
  • code/core/src/telemetry/types.ts
  • scripts/tsconfig.json
💤 Files with no reviewable changes (1)
  • scripts/tsconfig.json

Comment thread code/addons/a11y/src/preview.tsx Outdated
Comment thread code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.ts
Comment thread code/core/src/telemetry/event-cache.ts Outdated
@storybook-app-bot

storybook-app-bot bot commented Apr 14, 2026

Package Benchmarks

Commit: e81a5db, ran on 15 April 2026 at 13:32:41 UTC

No significant changes detected, all good. 👏

Base automatically changed from sidnioulz/agentic-telemetry-ws1 to project/sb-agentic-setup April 14, 2026 14:04
Comment thread code/addons/a11y/src/preview.tsx Outdated
Comment thread code/core/src/telemetry/types.ts Outdated
Comment thread code/core/src/telemetry/types.ts Outdated
Comment thread code/core/src/shared/utils/categorize-render-errors.test.ts Outdated
Comment thread code/core/src/shared/utils/analyze-test-results.ts
Comment thread code/addons/vitest/src/vitest-plugin/index.ts
Member

@Sidnioulz Sidnioulz left a comment

We need to update the disabled telemetry detection, and see comment on error categorisation. All the rest LGTM!

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/event-log-collector.ts (1)

75-85: ⚠️ Potential issue | 🟠 Major

--include/--exclude currently filter display only, not collection.

Docs at Line 13-Line 14 say “Only collect / Skip events”, but Line 73 and Line 81-Line 85 still store/write every event regardless of filter. This makes the option behavior inconsistent with its documented contract.

Suggested fix (if “collect” semantics are intended)
         const eventType = data.eventType || 'unknown';
         const entry = { receivedAt: new Date().toISOString(), ...data };
-        events.push(entry);
-
-        if (matchesFilter(eventType)) {
+        if (matchesFilter(eventType)) {
+          events.push(entry);
           console.log(`\n\x1b[1;32m[telemetry] ${eventType}\x1b[0m`);
           const logged = hideMetadata ? { ...data, metadata: undefined } : data;
           console.log(JSON.stringify(logged, null, 2));
-        }
-
-        await writeFile(
-          resolve(LOG_DIR, `events-${new Date().toISOString().slice(0, 10)}.jsonl`),
-          JSON.stringify(entry) + '\n',
-          { flag: 'a' }
-        );
+          await writeFile(
+            resolve(LOG_DIR, `events-${new Date().toISOString().slice(0, 10)}.jsonl`),
+            JSON.stringify(entry) + '\n',
+            { flag: 'a' }
+          );
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/event-log-collector.ts` around lines 75 - 85, The code currently
calls writeFile for every incoming event (using writeFile(resolve(LOG_DIR,
`events-${new Date().toISOString().slice(0, 10)}.jsonl`), JSON.stringify(entry)
+ '\n', { flag: 'a' })) regardless of the include/exclude filter; change the
logic so persistence follows the same filter as display by wrapping the
writeFile call in the same matchesFilter(eventType) check (or apply the inverse
when using an exclude mode), and when hideMetadata is enabled ensure the
persisted object matches the displayed logged variable (use the same logged
value instead of entry) so only filtered events are stored and stored payloads
respect hideMetadata.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/eval/lib/grade.ts`:
- Line 264: Call runStoryTests with the ghostRun flag set to true when grading
component candidates so STORYBOOK_COMPONENT_PATHS gets injected; change the
invocation where result is assigned (the call to runStoryTests(candidates, {
cwd: projectPath })) to include ghostRun: true in the options object so the
component-transform plugin runs for candidate component files and avoids "No
tests found" failures.

In `@scripts/event-log-collector.ts`:
- Around line 35-39: The getFlag function's current logic uses args.indexOf(arg)
which can return the wrong index when flags repeat and doesn't check for a
missing value; change getFlag to iterate with an index (e.g., for (let i = 0; i
< args.length; i++)) so you can use the current index i to read the next token
safely, and for the exact-match case ensure i + 1 < args.length and that args[i
+ 1] is not another flag token (e.g., doesn't start with '-' or with the flag
prefix) before returning it; keep the existing startsWith(`${flag}=`) branch
unchanged and return undefined if no valid value is found.
- Around line 45-46: The RegExp constructors for includeRegex and excludeRegex
can throw on malformed patterns; wrap the creation of includeRegex (from
includePattern) and excludeRegex (from excludePattern) in try/catch blocks,
validate the pattern strings before constructing, and on error log a clear
user-facing message including the invalid pattern and the regex error, then exit
non-zero (or return) so the script fails cleanly; ensure you still set
includeRegex/excludeRegex to null when the corresponding pattern is falsy or
when you decide to continue.
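
A minimal sketch of the guarded regex construction described in the last item. The `compilePattern` and `makeFilter` helpers are hypothetical names, and the include-then-exclude semantics of the filter are an assumption about how the script's `matchesFilter` behaves. The sketch throws a descriptive error instead of logging and exiting, so the caller can decide how to fail.

```typescript
// Compiles a user-supplied pattern, returning null when no pattern was given
// and throwing a readable error when the pattern is malformed.
function compilePattern(pattern: string | undefined, flagName: string): RegExp | null {
  if (!pattern) return null;
  try {
    return new RegExp(pattern);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(`Invalid ${flagName} pattern "${pattern}": ${reason}`);
  }
}

// Assumed semantics: an event must match the include pattern (if any) and
// must not match the exclude pattern (if any).
function makeFilter(include: RegExp | null, exclude: RegExp | null) {
  return (eventType: string): boolean => {
    if (include && !include.test(eventType)) return false;
    if (exclude && exclude.test(eventType)) return false;
    return true;
  };
}
```

At the script's entry point the error could be caught once, printed, and followed by a non-zero exit code, which keeps the helpers themselves free of process-level side effects.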

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9cf16072-4062-43e1-bdd0-615d03bcbeb1

📥 Commits

Reviewing files that changed from the base of the PR and between 175b59c and 8a008b2.

📒 Files selected for processing (16)
  • code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts
  • code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.ts
  • code/core/src/core-server/index.ts
  • code/core/src/core-server/server-channel/ai-setup-channel.ts
  • code/core/src/core-server/server-channel/ghost-stories-channel.test.ts
  • code/core/src/core-server/server-channel/ghost-stories-channel.ts
  • code/core/src/core-server/utils/ghost-stories/parse-vitest-report.ts
  • code/core/src/core-server/utils/ghost-stories/run-story-tests.ts
  • code/core/src/core-server/withTelemetry.ts
  • code/core/src/shared/utils/categorize-render-errors.test.ts
  • code/core/src/shared/utils/to-story-test-result.test.ts
  • code/core/src/shared/utils/to-story-test-result.ts
  • code/core/src/telemetry/ai-setup-utils.ts
  • code/core/src/telemetry/types.ts
  • scripts/eval/lib/grade.ts
  • scripts/event-log-collector.ts
✅ Files skipped from review due to trivial changes (2)
  • code/core/src/core-server/withTelemetry.ts
  • code/core/src/telemetry/types.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • code/core/src/shared/utils/categorize-render-errors.test.ts
  • code/addons/vitest/src/vitest-plugin/agent-telemetry-reporter.test.ts
  • code/core/src/core-server/index.ts
  • code/core/src/core-server/utils/ghost-stories/parse-vitest-report.ts

Comment thread scripts/eval/lib/grade.ts
logger.logStep(`Found ${candidates.length} candidate component(s) for ${label}`);

-    const result = await runGhostStories(candidates, { cwd: projectPath });
+    const result = await runStoryTests(candidates, { cwd: projectPath });
Contributor

⚠️ Potential issue | 🟠 Major

Pass ghostRun: true for component-candidate grading.

Line 264 currently runs runStoryTests without ghostRun. That skips STORYBOOK_COMPONENT_PATHS injection (see code/core/src/core-server/utils/ghost-stories/run-story-tests.ts, Line 50-Line 52), so the Vitest component-transform plugin won’t run for candidate component files and this path can degrade into "No tests found" results.

💡 Proposed fix
-    const result = await runStoryTests(candidates, { cwd: projectPath });
+    const result = await runStoryTests(candidates, { cwd: projectPath, ghostRun: true });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/eval/lib/grade.ts` at line 264, Call runStoryTests with the ghostRun
flag set to true when grading component candidates so STORYBOOK_COMPONENT_PATHS
gets injected; change the invocation where result is assigned (the call to
runStoryTests(candidates, { cwd: projectPath })) to include ghostRun: true in
the options object so the component-transform plugin runs for candidate
component files and avoids "No tests found" failures.

Comment on lines +35 to +39
const getFlag = (flag: string): string | undefined => {
  for (const arg of args) {
    if (arg === flag) return args[args.indexOf(arg) + 1];
    if (arg.startsWith(`${flag}=`)) return arg.slice(flag.length + 1);
  }
Contributor

⚠️ Potential issue | 🟡 Minor

Guard flag-value parsing against repeated flags and missing values.

Line 37 uses args.indexOf(arg), which can pick the first occurrence when a flag is repeated and can also treat another flag token as a value.

Suggested fix
 const args = process.argv.slice(2);
 const getFlag = (flag: string): string | undefined => {
-  for (const arg of args) {
-    if (arg === flag) return args[args.indexOf(arg) + 1];
+  for (let i = 0; i < args.length; i += 1) {
+    const arg = args[i];
+    if (arg === flag) {
+      const next = args[i + 1];
+      return next && !next.startsWith('--') ? next : undefined;
+    }
     if (arg.startsWith(`${flag}=`)) return arg.slice(flag.length + 1);
   }
   return undefined;
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const getFlag = (flag: string): string | undefined => {
-  for (const arg of args) {
-    if (arg === flag) return args[args.indexOf(arg) + 1];
-    if (arg.startsWith(`${flag}=`)) return arg.slice(flag.length + 1);
-  }
+const getFlag = (flag: string): string | undefined => {
+  for (let i = 0; i < args.length; i += 1) {
+    const arg = args[i];
+    if (arg === flag) {
+      const next = args[i + 1];
+      return next && !next.startsWith('--') ? next : undefined;
+    }
+    if (arg.startsWith(`${flag}=`)) return arg.slice(flag.length + 1);
+  }
+  return undefined;
+};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/event-log-collector.ts` around lines 35 - 39, The getFlag function's
current logic uses args.indexOf(arg) which can return the wrong index when flags
repeat and doesn't check for a missing value; change getFlag to iterate with an
index (e.g., for (let i = 0; i < args.length; i++)) so you can use the current
index i to read the next token safely, and for the exact-match case ensure i + 1
< args.length and that args[i + 1] is not another flag token (e.g., doesn't
start with '-' or with the flag prefix) before returning it; keep the existing
startsWith(`${flag}=`) branch unchanged and return undefined if no valid value
is found.
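
The suggested parsing can be tried in isolation with the sketch below. It follows the index-based loop from the suggestion, with one assumption: `args` is passed as a parameter so the function can be called with sample input, whereas the script reads `process.argv.slice(2)` at module scope.

```typescript
// Sketch of the index-based parser. Taking `args` as a parameter is an
// assumption made so the function can be exercised without process.argv.
function getFlag(args: string[], flag: string): string | undefined {
  for (let i = 0; i < args.length; i += 1) {
    const arg = args[i];
    if (arg === flag) {
      const next = args[i + 1];
      // Reject a missing value or a following flag token.
      return next && !next.startsWith('--') ? next : undefined;
    }
    // Inline form: --flag=value
    if (arg.startsWith(`${flag}=`)) return arg.slice(flag.length + 1);
  }
  return undefined;
}

const spaced = getFlag(['--include', 'story', '--no-metadata'], '--include');
const inline = getFlag(['--include=story'], '--include');
const followedByFlag = getFlag(['--include', '--exclude', 'debug'], '--include');
const trailing = getFlag(['--include'], '--include');

console.log(spaced, inline, followedByFlag, trailing); // story story undefined undefined
```

The last two calls are the cases the comment is concerned with: a flag followed by another flag, and a flag at the end of the argument list.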

Comment on lines +45 to +46
const includeRegex = includePattern ? new RegExp(includePattern) : null;
const excludeRegex = excludePattern ? new RegExp(excludePattern) : null;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify unguarded dynamic RegExp construction in this script
rg -n "new RegExp\\(" scripts/event-log-collector.ts
rg -n -C2 "includePattern|excludePattern" scripts/event-log-collector.ts

Repository: storybookjs/storybook

Length of output: 943


🏁 Script executed:

wc -l scripts/event-log-collector.ts

Repository: storybookjs/storybook

Length of output: 99


🏁 Script executed:

cat -n scripts/event-log-collector.ts

Repository: storybookjs/storybook

Length of output: 5530


Wrap RegExp construction in error handling to provide clear feedback on invalid patterns.

Lines 45-46 throw synchronously on malformed regex patterns (e.g., an unmatched `(`), preventing the script from starting. The current code offers no user-friendly error message when this occurs.

Suggested fix
 const includePattern = getFlag('--include');
 const excludePattern = getFlag('--exclude');
-const includeRegex = includePattern ? new RegExp(includePattern) : null;
-const excludeRegex = excludePattern ? new RegExp(excludePattern) : null;
+const toRegex = (pattern: string | undefined, name: string): RegExp | null => {
+  if (!pattern) return null;
+  try {
+    return new RegExp(pattern);
+  } catch (error) {
+    throw new Error(`Invalid ${name} regex "${pattern}": ${(error as Error).message}`);
+  }
+};
+const includeRegex = toRegex(includePattern, '--include');
+const excludeRegex = toRegex(excludePattern, '--exclude');
 const hideMetadata = args.includes('--no-metadata');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const includeRegex = includePattern ? new RegExp(includePattern) : null;
-const excludeRegex = excludePattern ? new RegExp(excludePattern) : null;
+const includePattern = getFlag('--include');
+const excludePattern = getFlag('--exclude');
+const toRegex = (pattern: string | undefined, name: string): RegExp | null => {
+  if (!pattern) return null;
+  try {
+    return new RegExp(pattern);
+  } catch (error) {
+    throw new Error(`Invalid ${name} regex "${pattern}": ${(error as Error).message}`);
+  }
+};
+const includeRegex = toRegex(includePattern, '--include');
+const excludeRegex = toRegex(excludePattern, '--exclude');
+const hideMetadata = args.includes('--no-metadata');
🧰 Tools
🪛 ast-grep (0.42.1)

[warning] 45-45: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(excludePattern)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/event-log-collector.ts` around lines 45 - 46, The RegExp constructors
for includeRegex and excludeRegex can throw on malformed patterns; wrap the
creation of includeRegex (from includePattern) and excludeRegex (from
excludePattern) in try/catch blocks, validate the pattern strings before
constructing, and on error log a clear user-facing message including the invalid
pattern and the regex error, then exit non-zero (or return) so the script fails
cleanly; ensure you still set includeRegex/excludeRegex to null when the
corresponding pattern is falsy or when you decide to continue.
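
A standalone sketch of the guarded construction follows. `toRegex` mirrors the helper from the suggested fix, while `describeFailure` is a hypothetical wrapper added here only to show the resulting message. The sample patterns are illustrative.

```typescript
// Mirrors the suggested helper: null for an absent pattern, a RegExp for a
// valid one, and a descriptive error for a malformed one.
function toRegex(pattern: string | undefined, name: string): RegExp | null {
  if (!pattern) return null;
  try {
    return new RegExp(pattern);
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    throw new Error(`Invalid ${name} regex "${pattern}": ${reason}`);
  }
}

// Hypothetical wrapper, used only to surface the error text in this example.
function describeFailure(pattern: string, name: string): string | null {
  try {
    toRegex(pattern, name);
    return null;
  } catch (error) {
    return (error as Error).message;
  }
}

const valid = toRegex('^story-', '--include');
const absent = toRegex(undefined, '--exclude');
const failure = describeFailure('(', '--include');

console.log(valid?.test('story-rendered')); // true
console.log(absent); // null
console.log(failure); // starts with: Invalid --include regex "(":
```

The text after the prefix comes from the JavaScript engine, so its exact wording can differ between runtimes.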


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
code/core/src/telemetry/ai-setup-utils.test.ts (1)

254-259: Consider extracting a helper for payload-factory retrieval/assertion.

The same mock.calls[0][1] extraction and resolves.toMatchObject pattern is repeated three times; a small helper would keep the tests tighter and easier to maintain.

Also applies to: 276-279, 318-321

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/core/src/telemetry/ai-setup-utils.test.ts` around lines 254 - 259,
Extract a small test helper (e.g., getAndAssertPayloadFactory or
assertTelemetryPayloadFactory) that accepts the telemetry mock, expected object,
and optional call index; inside it, retrieve the factory via
vi.mocked(telemetry).mock.calls[callIndex ?? 0][1] as () => Promise<unknown>,
invoke await expect(factory()).resolves.toMatchObject(expected), and replace the
duplicated extraction/assertion lines at the three locations (the current uses
of telemetry.mock.calls[0][1] and similar at lines ~254, ~276, ~318) with calls
to this helper to DRY the test.
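
One possible shape for such a helper is sketched below without a Vitest dependency, so it runs on its own. `getPayloadFactory` is a hypothetical name, `recordedCalls` stands in for `vi.mocked(telemetry).mock.calls`, and `'example-event'` is a placeholder event type. In the real test file the returned factory would be passed to `expect(...).resolves.toMatchObject(...)`.

```typescript
type PayloadFactory = () => Promise<unknown>;

// `calls` has the shape of a mock's recorded calls: one argument array per call.
// The helper name and error messages are assumptions for illustration.
function getPayloadFactory(calls: unknown[][], callIndex = 0): PayloadFactory {
  const call = calls[callIndex];
  if (!call) {
    throw new Error(`No recorded call at index ${callIndex}`);
  }
  const candidate = call[1];
  if (typeof candidate !== 'function') {
    throw new Error(`Second argument of call ${callIndex} is not a payload factory`);
  }
  return candidate as PayloadFactory;
}

// Stand-in for vi.mocked(telemetry).mock.calls; the event name is a placeholder.
const recordedCalls: unknown[][] = [
  ['example-event', async () => ({ agent: 'example' })],
];

const factory = getPayloadFactory(recordedCalls);
factory().then((payload) => console.log(payload)); // { agent: 'example' }
```

Failing early when the call or the factory is missing gives a clearer message than an undefined-access error inside each test.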
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code/core/src/telemetry/ai-setup-utils.test.ts`:
- Around line 73-78: The shared mock currently executes payload factories inside
vi.mocked(telemetry).mockImplementation which causes collectAiSetupEvidence
tests to run payload-building twice; change the mock so it does not call
functions but instead returns payloadOrFactory as-is (i.e., if payloadOrFactory
is a function, return the function rather than invoking it) so tests can capture
and invoke the factory themselves; update the mockImplementation in
ai-setup-utils.test.ts (the vi.mocked(telemetry).mockImplementation) accordingly
and ensure tests that call collectAiSetupEvidence still assert by invoking the
captured factory.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 374c8ca7-e3d6-4614-bd66-7a366ec73966

📥 Commits

Reviewing files that changed from the base of the PR and between 8a008b2 and e81a5db.

📒 Files selected for processing (1)
  • code/core/src/telemetry/ai-setup-utils.test.ts

Comment on lines +73 to +78
vi.mocked(telemetry).mockImplementation(async (_eventType, payloadOrFactory) => {
  if (typeof payloadOrFactory === 'function') {
    return payloadOrFactory();
  }
  return payloadOrFactory;
});

⚠️ Potential issue | 🟡 Minor

Avoid executing the telemetry payload factory inside the shared mock.

This runs payload-building logic during collectAiSetupEvidence() and then again when the test manually invokes the captured factory, which makes these tests more brittle for non-idempotent payloads.

Suggested change
 beforeEach(() => {
   vi.resetAllMocks();
-  vi.mocked(telemetry).mockImplementation(async (_eventType, payloadOrFactory) => {
-    if (typeof payloadOrFactory === 'function') {
-      return payloadOrFactory();
-    }
-    return payloadOrFactory;
-  });
+  vi.mocked(telemetry).mockResolvedValue(undefined);
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code/core/src/telemetry/ai-setup-utils.test.ts` around lines 73 - 78, The
shared mock currently executes payload factories inside
vi.mocked(telemetry).mockImplementation which causes collectAiSetupEvidence
tests to run payload-building twice; change the mock so it does not call
functions but instead returns payloadOrFactory as-is (i.e., if payloadOrFactory
is a function, return the function rather than invoking it) so tests can capture
and invoke the factory themselves; update the mockImplementation in
ai-setup-utils.test.ts (the vi.mocked(telemetry).mockImplementation) accordingly
and ensure tests that call collectAiSetupEvidence still assert by invoking the
captured factory.
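
The double execution can be reproduced without Vitest using the simplified, synchronous sketch below. `eagerStub` imitates the shared mock under review, `passiveStub` stands in for a mock that only records the call, and `countBuilds` counts how often a non-idempotent factory runs. All three names, the `Payload` shape, and the `'example-event'` name are assumptions for illustration.

```typescript
type Payload = { attempt: number };
type Factory = () => Payload;
type TelemetryStub = (eventType: string, factory: Factory) => void;

// Eager stub: invokes the factory itself, like the shared mock under review.
const eagerStub: TelemetryStub = (_eventType, factory) => {
  factory();
};

// Passive stub: only receives the call and never runs the factory.
const passiveStub: TelemetryStub = () => {};

function countBuilds(stub: TelemetryStub): number {
  let builds = 0;
  const factory: Factory = () => ({ attempt: ++builds });
  stub('example-event', factory); // what the code under test does
  factory(); // what the test does with the captured factory
  return builds;
}

const eagerBuilds = countBuilds(eagerStub);
const passiveBuilds = countBuilds(passiveStub);

console.log(eagerBuilds, passiveBuilds); // 2 1
```

With the passive stub, the only invocation is the one the test performs, so a payload that changes on each build is asserted against its first value.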

@yannbf yannbf merged commit 7821847 into project/sb-agentic-setup Apr 15, 2026
125 checks passed
@yannbf yannbf deleted the sidnioulz/agentic-telemetry-ws2 branch April 15, 2026 13:52
Labels

ci:normal maintenance User-facing maintenance tasks
