feat(plugins): self-healing lifecycle for built-in plugins by dylanneve1 · Pull Request #86 · dylanneve1/talon

dylanneve1 · 2026-04-23T23:18:26Z

Summary

Supersedes #83, #84, #85. Single unified overhaul of the built-in plugin system so:

Enabling = installed. No autoInstall / autoPull / installBrowsers flags. If a built-in plugin is enabled, Talon sets it up.
Self-healing on every start. Version drift is detected and realigned to the Talon-pinned target; missing deps are installed; cached failures degrade gracefully instead of bricking startup.
Structured, timed progress logging. Every heal emits numbered steps with elapsed time and a classified error on failure — no more staring at raw pip stderr guessing whether it's a network issue, a missing version, or a permission problem.
Real functional CI across Linux, macOS, Windows. Every plugin's MCP server gets spawned for real and exercised via the stdio protocol in the smoke matrix.

User-facing config (after this PR)

{
  "mempalace":  { "enabled": true },
  "github":     { "enabled": true, "token": "ghp_..." },
  "playwright": { "enabled": true, "browser": "chromium" }
}

All three now "just work" when enabled. Mempalace creates its own venv at ~/.talon/mempalace-venv/ and pip installs the pinned release. Github pulls the pinned Docker tag. Playwright downloads the configured browser binary.

Architecture

`src/plugins/common/`

Small, focused shared infrastructure:

Module	Purpose
`semver.ts`	parse / compare / floor-check — no external dep
`errors.ts`	`PluginError` taxonomy (8 kinds) + subprocess stderr classifier with tested regex patterns for network / disk-space / permission / upstream-unavailable / timeout
`progress.ts`	numbered, timed `StepTracker` with pluggable `LogSink` — routes to Talon's `log.ts` in production, captured in-memory in tests
`subprocess.ts`	`spawn`-based runner that streams lines live to the tracker (normalises CR/LF/CR-only progress output), captures full stdout/stderr, returns a classified `PluginError` on failure, never throws
`lifecycle.ts`	`HealFn` contract + `runHeal()` with hard timeout guard

Per-plugin heal modules

Each plugin exposes a pin constant + createXxxHeal(opts) factory:

Plugin	File	Pin	Heal responsibilities
mempalace	`plugins/mempalace/heal.ts`	`MEMPALACE_TARGET = "3.3.2"` `MEMPALACE_FLOOR = "3.3.2"`	managed mode: create venv → pip install pin → verify import · verify-only mode (user-provided pythonPath): probe only, never mutate
github	`plugins/github/heal.ts`	`GITHUB_MCP_IMAGE = "ghcr.io/github/github-mcp-server:v1.0.2"`	always pull (refresh digest); pull-fail + cached → degraded; pull-fail + no cache → failed
playwright	`plugins/playwright/heal.ts`	`PLAYWRIGHT_MCP_VERSION = "0.0.70"` (matches `package.json` pin)	verify npm version (no node_modules mutation); install missing browser binary

Tests

Deliberately targeted, not volume. Dylan's instruction: meaningful and well-engineered, not padding.

File	What it covers
`plugins/common/semver.test.ts`	4 suites, 7 cases — parseSemVer happy / malformed, compareSemVer ordering, isAtLeast / isExactMatch
`plugins/common/errors.test.ts`	11 cases — every classifier branch: spawn ENOENT, EACCES, signal, disk-space, network, upstream-unavailable, permission, unknown fallthrough, details preservation
`plugins/common/progress.test.ts`	9 cases — step numbering, ok/fail/skip routing to the right log level, stream indentation, records() snapshot shape, `formatElapsed` across durations
`plugins/common/subprocess.test.ts`	8 cases — line splitting across chunk boundaries, CR-only (docker-style) normalization, ENOENT / network / signal classification propagation, runProbe fast path
`plugins/mempalace-heal.test.ts`	9 cases — cold start (venv created), aligned-at-pin no-op, realign when off-pin, verify-only above-floor OK, verify-only below-floor degraded, pip network failure classified, user-supplied python missing, user-supplied python without mempalace
`plugins/github-heal.test.ts`	6 cases — pin-tag invariant, happy path with refresh, cold-start pull, daemon unreachable, pull-fail with cached degraded, pull-fail no-cache failed, image override
`plugins/playwright-heal.test.ts`	11 cases — pinned semver check, browser list stability, CLI missing, version drift fails loud, remote-endpoint mode skips browser, browser-present skip-install, browser-missing install, install failure classified, post-install still-missing degraded
`plugins/mempalace-plugin.test.ts`	5 cases — factory MCP server wiring, managed vs verify-only inference, getEnvVars combinations, getSystemPromptAddition substitution + fallback

Total: 1421 tests passing · typecheck clean · prettier clean · 0 lint errors.

Functional CI

Single workflow .github/workflows/plugin-smoke.yml with three jobs:

mempalace — matrix Ubuntu / macOS / Windows × Python 3.11 / 3.12 (6 runs). Creates fresh venv, installs pin, spawns mempalace.mcp_server, asserts tools/list includes mempalace_status + mempalace_search, calls mempalace_status.
github — Ubuntu only (Docker dependency). Pulls pinned image, spawns via docker run -i, asserts tools include get_me, search_repositories, list_issues.
playwright — matrix Ubuntu / macOS / Windows × chromium (3 runs). Verifies npm pin alignment, installs browser, spawns @playwright/mcp, navigates to about:blank, closes the browser.

Triggers: PRs touching plugin paths / scripts / config / workflow, main pushes, nightly cron at 05:30 UTC, manual dispatch.

Each job uses the shared scripts/lib/mcp-stdio-client.mjs — a minimal JSON-RPC client that handles the initialize → notifications/initialized → tools/list → tools/call sequence.

Local verification

Every piece tested on x86_64 Linux, Python 3.12:

npx vitest run → 1421/1421 passing
npm run typecheck → clean
npm run format:check → clean
npm run lint → 0 errors (9 pre-existing warnings in unrelated files)
node scripts/smoke-mempalace.mjs install + smoke → 29 tools, mempalace_status OK
node scripts/smoke-github.mjs pull + smoke → 41 tools, handshake OK
node scripts/smoke-playwright.mjs install + smoke → 21 tools, browser_navigate + browser_close OK

Closes

Supersedes:

fix(mempalace): require >= 3.3.2 for PID guard + HNSW crash fix #83 (mempalace 3.3.2 version bump)
feat(mempalace): automated onboarding + pinned version + multi-OS CI smoke #84 (mempalace onboarding + CI smoke)
feat(github,playwright): automated onboarding + pinned versions + CI smoke #85 (github + playwright onboarding + CI smoke)

The pinned versions, CI matrix, and onboarding flows from those three are all folded into this one, plus the shared lifecycle infrastructure that none of them had individually.

Two failures caught by the first round of CI on PR #86: ## Code Quality job The 'check for .only/.skip' step grepped for '\.skip\s*\(' which matched my new StepTracker.skip() method calls in plugins/common/progress.ts and the tests that exercise it. False positive — narrowed the regex to vitest's actual focus/skip markers ('it.only', 'describe.skip', etc.). Real focus/skip markers still get caught. ## playwright on windows-latest 'spawn npx ENOENT' — on Windows, npm ships 'npx.cmd' not 'npx', and Node's spawn won't resolve .cmd extensions without shell:true. Added a platform-aware NPX_BIN constant in both plugins/playwright/heal.ts and scripts/smoke-playwright.mjs: const NPX_BIN = process.platform === 'win32' ? 'npx.cmd' : 'npx'; Heal tests updated to accept either name so the suite is platform-agnostic. 1421 tests pass · typecheck clean · prettier clean.

Windows playwright smoke got past ENOENT with the npx.cmd rename, but hit EINVAL on spawn. That's Node 20.19+ rejecting direct spawn of .cmd/.bat shims as a CVE-2024-27980 mitigation — the only supported way to launch them is via shell. Added an auto-detect in common/subprocess.ts runStreaming and the smoke driver's run() wrapper: on Windows, if command ends in .cmd or .bat, pass shell:true. Args come from our own code (not user input), so no new injection surface. 1421 tests still pass. Targeted at the last red job on #86.

Copilot

Pull request overview

This PR overhauls Talon’s built-in plugins (mempalace, github, playwright) into a unified “self-healing on startup” lifecycle with shared infra for semver checks, structured progress logging, subprocess execution/error classification, and adds functional smoke CI.

Changes:

Add shared self-heal infrastructure under src/plugins/common/ (lifecycle runner, subprocess streaming runner, progress logger, semver helpers, error taxonomy/classifier).
Refactor built-in plugins to run heal during init() and align to pinned versions (Docker image / npm package / Python package + venv).
Add functional smoke scripts + a new GitHub Actions workflow to spawn each MCP server and exercise stdio protocol across OS matrices.

Reviewed changes

Copilot reviewed 32 out of 34 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`src/plugins/common/lifecycle.ts`	Adds `HealFn`/`runHeal()` timeout-guarded lifecycle runner.
`src/plugins/common/subprocess.ts`	Adds `spawn`-based runner with live line streaming and error classification.
`src/plugins/common/progress.ts`	Adds numbered/timed step logger with pluggable sink.
`src/plugins/common/errors.ts`	Introduces `PluginError` taxonomy + subprocess stderr classifier.
`src/plugins/common/semver.ts`	Adds minimal semver parsing/comparison utilities.
`src/plugins/mempalace/index.ts` / `heal.ts`	Converts mempalace to managed-venv self-heal (or verify-only with custom pythonPath).
`src/plugins/github/index.ts` / `heal.ts`	Converts github plugin to pinned-image pull/verify self-heal with optional image override.
`src/plugins/playwright/index.ts` / `heal.ts`	Converts playwright plugin to pinned npm-version verification + browser binary install self-heal.
`src/core/plugin.ts`	Updates built-in plugin loading/init timeouts and passes new config fields through.
`scripts/*` + `scripts/lib/mcp-stdio-client.mjs`	Adds functional smoke drivers and shared MCP stdio client.
`.github/workflows/plugin-smoke.yml`	Adds multi-OS functional smoke workflow for built-in plugins.
`package.json` / `package-lock.json` / `Dockerfile`	Pins `@playwright/mcp`, adds postinstall pruning hook, updates Docker build install steps.
`README.md` / `.gitignore` / `src/util/config.ts`	Updates docs/config for self-healing behavior and new ignore entries.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Five issues surfaced by Copilot review: 1. scripts/lib/mcp-stdio-client.mjs: readPinnedConstant used require() inside an ESM module — would ReferenceError if ever called. Dead code, removed. The three smoke scripts have their own inline pin readers. 2. common/lifecycle.ts runHeal(): setTimeout was never cleared when heal resolved first, leaving a live timer until totalTimeoutMs elapsed. On short heal runs this kept short-lived CLI processes alive and accumulated timers across hot reloads. Track the handle explicitly, clearTimeout on every exit path, and .unref() so it doesn't block process exit in any case. 3. common/lifecycle.ts HealResult JSDoc: 'failed' contract said the MCP server should not be spawned, but the loader doesn't gate on that. Updated the comment to describe actual behavior (loader logs + tries anyway so transient heal issues don't brick startup). 4. github/index.ts validateConfig: erroring on missing token contradicted the header docs that say token is optional (falls back to gh auth token; server surfaces its own 401 on authenticated calls). Blocking plugin registration here made '{"github":{"enabled":true}}' unusable without a token. Downgraded to a runtime logWarn in init(). 5. common/subprocess.ts line buffering: never flushed trailing content that lacked a final newline — pip sometimes exits after printing a final status line unterminated, and we'd silently drop it. Added flushCarry() on child close + regression test covering the case. 1422/1422 tests pass (was 1421 + 1 new flush test). Typecheck clean, prettier clean.

Copilot

Pull request overview

Copilot reviewed 32 out of 34 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 32 out of 34 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 32 out of 34 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 32 out of 34 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-24T09:38:48Z

+    rmSync(target, { recursive: true, force: true });
+    console.log(`[prune-native-sdk] removed ${wrong} (host is ${isGlibc ? "glibc" : "musl"})`);
+  } catch (err) {
+    console.warn(`[prune-native-sdk] failed to remove ${wrong}: ${err.message}`);


In the catch block, err is unknown, but the code assumes err.message exists. To avoid a secondary failure (or logging undefined), coerce safely (e.g., err instanceof Error ? err.message : String(err)).

Suggested change

console.warn(`[prune-native-sdk] failed to remove ${wrong}: ${err.message}`);

const errorMessage = err instanceof Error ? err.message : String(err);

console.warn(`[prune-native-sdk] failed to remove ${wrong}: ${errorMessage}`);

Copilot · 2026-04-24T09:38:48Z

+    heal(ctx).then(
+      (result) => ({ kind: "done" as const, result }),
+      (err: unknown) => ({
+        kind: "threw" as const,
+        err:
+          err instanceof Error
+            ? err
+            : new Error(typeof err === "string" ? err : JSON.stringify(err)),
+      }),
+    ),


runHeal() intends to handle heal implementations that throw, but heal(ctx).then(...) will not catch a synchronous throw before a Promise is returned (the .then call itself would throw, bypassing the timeout guard and the structured error path). Wrap the call with Promise.resolve().then(() => heal(ctx)) (or similar) so both sync throws and async rejections are handled consistently.

Suggested change

heal(ctx).then(

(result) => ({ kind: "done" as const, result }),

(err: unknown) => ({

kind: "threw" as const,

err:

err instanceof Error

? err

: new Error(typeof err === "string" ? err : JSON.stringify(err)),

}),

),

Promise.resolve()

.then(() => heal(ctx))

.then(

(result) => ({ kind: "done" as const, result }),

(err: unknown) => ({

kind: "threw" as const,

err:

err instanceof Error

? err

: new Error(typeof err === "string" ? err : JSON.stringify(err)),

}),

),

Copilot · 2026-04-24T09:38:48Z

+      kind: "timeout",
+      message: `${program} killed by ${signal}`,
+      hint: `retry with a larger timeout or check for a stuck ${program} process`,


classifySubprocessError() treats any non-null signal as kind: "timeout", but processes can terminate via signals for reasons unrelated to timeouts (e.g., SIGSEGV, SIGKILL from OOM killer, manual kill). This can mislead users and downstream telemetry. Consider classifying signal terminations as unknown (or a dedicated kind) unless you can positively identify it came from your timeout logic.

Suggested change

kind: "timeout",

message: `${program} killed by ${signal}`,

hint: `retry with a larger timeout or check for a stuck ${program} process`,

kind: "unknown",

message: `${program} terminated by signal ${signal}`,

hint: `inspect ${program} logs or stderr output and retry`,

Copilot · 2026-04-24T09:38:49Z

+  if (config.pythonPath && config.pythonPath.trim().length > 0) {
+    return { pythonPath: config.pythonPath, managed: false };
+  }


resolvePython() checks config.pythonPath.trim() but returns the untrimmed config.pythonPath. If the user accidentally includes leading/trailing whitespace (allowed by the zod schema), Talon will try to execute an invalid path. Return the trimmed value when pythonPath is provided.

Copilot · 2026-04-24T09:38:49Z

+  config: CreateGitHubPluginConfig,
+): TalonPlugin {
  const token = resolveToken(config.token);
+  const image = config.image ?? GITHUB_MCP_IMAGE;


config.image can contain leading/trailing whitespace (zod uses min(1) but doesn't trim). Using it directly in the docker args can cause confusing failures (image not found). Consider trimming config.image (similar to resolveToken) before falling back to the pinned GITHUB_MCP_IMAGE.

Suggested change

const image = config.image ?? GITHUB_MCP_IMAGE;

const image = config.image?.trim() || GITHUB_MCP_IMAGE;

claudiusthebot

Reviewed the full diff. This is clean, production-quality work. Approving.

What I read:

All 34 changed files — CI workflows, src/plugins/common/*, all three heal modules, updated plugin index files, smoke scripts, core/plugin.ts
All 23 CI check results: green across the full matrix (ubuntu/macos/windows × py3.11/py3.12 for mempalace, ubuntu for github-mcp Docker, ubuntu/macos/windows × chromium for playwright)

What's good:

src/plugins/common/ infrastructure — Clean, justified abstractions. The HealFn contract (never throw, always return HealResult) is the right call — a throwing heal would bypass the runner's logging and leave users guessing. The ProgressLogger / injectable LogSink separation is good test architecture, not over-engineering. semver.ts being intentionally minimal is the right call — no point pulling a semver library for a floor-check.

subprocess.ts Windows shell: true handling — The CVE-2024-27980 mitigation is correct and narrowly scoped. You're enabling shell only when the command ends in .cmd/.bat on win32, which is exactly the right scope. The comment explaining why ("we control the args, so the usual shell injection surface doesn't apply") is good — that's the right justification to leave in the code.

mempalace/heal.ts managed vs verify-only — Smart. Passing pythonPath: undefined vs an explicit path as the ownership signal is clean. Never invoking pip on a user-provided interpreter is the right default; the consequence of getting that wrong (silently upgrading someone's system Python env) would be nasty.

playwright/heal.ts version read from package.json — Clever dependabot-compatibility design. Instead of hardcoding PLAYWRIGHT_MCP_VERSION = "0.0.70" that'd go stale on every bump PR, reading it from package.json at module load keeps them automatically in sync. The fallback to "unknown" on parse failure is safe.

GitHub heal "always pull" — Acknowledged in the PR description, acceptable tradeoff. The 5-minute timeout on the pull step is appropriate; digest refresh on an already-cached image is fast, so this only matters on slow connections or cold caches where the time is justified anyway.

Timeout layering in plugin.ts — GitHub 7min inner/outer safety, mempalace 10min, playwright 14min. Well-commented as to why each limit exists (big wheels, browser binaries, etc.). Generous but justified.

Tests — Targeted, not volumetric. Covering the edge cases that actually hurt during iteration (Windows EINVAL, CR-only docker progress output, StepTracker.skip() false-positive in the CI regex) is more valuable than 20 happy-path cases.

One nit (non-blocking):

MEMPALACE_TARGET = "3.3.2" and MEMPALACE_FLOOR = "3.3.2" in mempalace/heal.ts. mempalace 3.3.4 was released today (I confirmed this while reviewing PR #100 docs earlier today). The auto-tunnel feature we care about landed in 3.3.4. Worth bumping the pin to "3.3.4" before or shortly after merging this — the smoke matrix will run and confirm it. Not a blocker since the heal infrastructure works at any pin, but 3.3.2 is immediately stale.

Verdict: Merge when ready. The abstraction is clean, the tradeoffs are documented, the CI matrix is real (not mocked), and the test coverage is meaningful. The plugin-lifecycle work in #83–#85 was worth superseding for this unified approach.

— Claudius Jr (heartbeat review, 2026-05-01)

claudiusthebot · 2026-05-02T06:43:25Z

Heartbeat follow-up (2026-05-02 ~06:45 UTC): Pushed one more commit to fix the stale pin I flagged in my review.

What changed: MEMPALACE_TARGET and MEMPALACE_FLOOR bumped "3.3.2" → "3.3.4" in src/plugins/mempalace/heal.ts.

Why: mempalace 3.3.4 shipped 2026-05-01 with native auto-tunnel support. The VPS venv was silently upgraded. Merging with the pin still at 3.3.2 would have caused the heal to downgrade the installed version on every Talon startup — the exact opposite of the intended behaviour.

Playwright pin: no change needed — heal.ts already reads PLAYWRIGHT_MCP_VERSION dynamically from package.json (the readPinnedVersionFromPackageJson() function added in the last commit on this branch). When dependabot bumps @playwright/mcp in package.json on main, the heal module picks it up automatically. No hand-maintenance required there.

This PR should now be clean to merge as-is.

claudiusthebot · 2026-05-02T07:48:25Z

CI re-triggered and passing ✅

My pin-fix commit (2ffd848) didn't automatically trigger CI (the pull_request: synchronize event only fires for the PR author's pushes, not collaborator pushes). Manually dispatched the CI workflow — it ran and passed in 1m18s (run 25247184310).

PR should be merge-ready now.

claudiusthebot · 2026-05-03T02:45:03Z

Heartbeat fix (2026-05-03): Pushed one more commit (f9f58c1) to resolve a version divergence from main.

Since this branch was cut, dependabot landed two playwright bumps on main:

PR chore(deps): Bump the production-dependencies group with 2 updates #94: 0.0.70 → ^0.0.71
PR chore(deps): Bump the production-dependencies group with 6 updates #102: ^0.0.71 → ^0.0.72 (as a group bump)

The branch intentionally removed the caret for exact pinning (so PLAYWRIGHT_MCP_VERSION gets a clean semver with no range prefix). The fix just updates the exact pin: "0.0.70" → "0.0.72".

All 12 playwright-heal tests pass (the drift-guard test at line 114 reads dynamically from package.json, so it follows automatically). Typecheck clean. CI re-dispatched manually via plugin-smoke.yml.

claudiusthebot · 2026-05-03T04:45:43Z

Lockfile fix — commit 84aa614 (heartbeat #35)

The previous playwright pin bump (f9f58c1) updated package.json but not package-lock.json. npm ci requires both to be in sync, so all 10 CI jobs failed at the install step.

Regenerated lockfile with npm install --package-lock-only. The only meaningful change is @playwright/mcp resolving from 0.0.70 → 0.0.72. CI re-dispatched (run 25270141221) — should pass.

claudiusthebot · 2026-05-03T04:48:49Z

Second lockfile fix — commit 48ff608

First attempt (84aa614) used npm install --package-lock-only which didn't fully resolve transitive dependencies of @playwright/mcp@0.0.72. CI was missing @emnapi/core@{1.9.2,1.10.0} and @emnapi/runtime@{1.9.2,1.10.0} from the lockfile (onnxruntime-node optional peer deps).

Full npm install resolves the complete tree. All four versions are now present. CI re-dispatched (run 25270194001).

claudiusthebot · 2026-05-03T04:51:49Z

Third lockfile fix — commit 47dd366

Same pattern as 758e6ed (Apr 24). The second attempt still missed @emnapi/core@1.10.0 as a resolved package — it was listed as a dep of @rolldown/binding-wasm32-wasi (wasm32 optional platform package) but npm on Linux x64 skips that package and doesn't fully resolve its deps.

Fix: rm -rf node_modules && rm package-lock.json && npm cache clean --force && npm install --include=optional. This forces npm to resolve all optional packages regardless of host platform and captures all @emnapi transitive deps (1.9.2 + 1.10.0). CI re-dispatched (run 25270242707).

Overhaul of mempalace, github, and playwright to remove opt-in flags and introduce a shared, robust self-heal layer. Replaces #83, #84, #85. Enabling a built-in plugin now fully sets it up — no flags, no manual bootstrap: { "mempalace": { "enabled": true } } { "github": { "enabled": true, "token": "ghp_..." } } { "playwright":{ "enabled": true, "browser": "chromium" } } On every start Talon re-verifies the install and realigns to the pinned version if it drifted. Removed: autoInstall, autoPull, installBrowsers. Added: structured per-step progress logging with timings that pinpoint the slow/failing step. - semver.ts — tiny parse/compare for the floor + pin check - errors.ts — PluginError taxonomy (8 kinds) + subprocess stderr classifier with regression-tested patterns - progress.ts — numbered, timed step tracker with pluggable log sink - subprocess.ts — line-streaming spawn wrapper that normalises CR/LF/CR-only output (docker pull uses bare \r), captures stderr, and returns a classified PluginError on failure - lifecycle.ts — HealFn contract + runHeal() with hard timeout guard Each plugin keeps the upstream version pin as a source-of-truth constant and implements a dedicated heal() that uses the shared infra: - mempalace/heal.ts: MEMPALACE_TARGET="3.3.2", MEMPALACE_FLOOR="3.3.2". Managed mode owns ~/.talon/mempalace-venv (create → pip install → verify). User-supplied pythonPath switches to verify-only mode — we never mutate a user's Python environment without consent. - github/heal.ts: GITHUB_MCP_IMAGE="ghcr.io/github/github-mcp-server:v1.0.2". Always pulls on heal to refresh digest. Pull-fail with cached image → degraded (we run on stale); pull-fail without cache → failed. - playwright/heal.ts: PLAYWRIGHT_MCP_VERSION="0.0.70" (matches pin in package.json). Verify-only for the npm package (we do NOT mutate node_modules at runtime), auto-install for the chosen browser binary. Deliberately targeted, not volume. Shared infra: 39 cases. Per-plugin heal: 9 mempalace, 6 github, 11 playwright. Plugin factory: 5 mempalace. All test at decision boundaries (managed vs verify-only, present vs missing vs drift, happy vs classified-failure paths) — no mock-the-mock redundancy. 1421/1421 total passing · typecheck clean · prettier clean · 0 lint errors. Single workflow, three jobs: mempalace → matrix Ubuntu/macOS/Windows × Python 3.11/3.12 (6 runs) github → Ubuntu (Docker-only) (1 run) playwright → matrix Ubuntu/macOS/Windows × chromium (3 runs) Each job: pull/install the pinned version, spawn the real MCP server over stdio, handshake, tools/list, call a representative tool. No mocks, no network fakes. Triggers: PRs touching plugin paths, main pushes, nightly cron (05:30 UTC), manual dispatch. - npx vitest run → 1421/1421 passing - npm run typecheck → clean - npm run format:check → clean - npm run lint → 0 errors (9 pre-existing warnings) - node scripts/smoke-mempalace.mjs install + smoke → 29 tools, status ok - node scripts/smoke-github.mjs pull + smoke → 41 tools, list ok - node scripts/smoke-playwright.mjs install + smoke → 21 tools, navigate + close ok

Previous commit used `npm install --package-lock-only` which didn't properly resolve the @emnapi transitive dependencies, leaving package-lock.json out of sync with package.json. That broke `npm ci` on CI with: npm error `npm ci` can only install packages when your package.json and package-lock.json or npm-shrinkwrap.json are in sync. Missing: @emnapi/core@1.9.2 from lock file Regenerated the lockfile from scratch (rm node_modules + npm install). `npm ci` now clean.

Two failures caught by the first round of CI on PR #86: ## Code Quality job The 'check for .only/.skip' step grepped for '\.skip\s*\(' which matched my new StepTracker.skip() method calls in plugins/common/progress.ts and the tests that exercise it. False positive — narrowed the regex to vitest's actual focus/skip markers ('it.only', 'describe.skip', etc.). Real focus/skip markers still get caught. ## playwright on windows-latest 'spawn npx ENOENT' — on Windows, npm ships 'npx.cmd' not 'npx', and Node's spawn won't resolve .cmd extensions without shell:true. Added a platform-aware NPX_BIN constant in both plugins/playwright/heal.ts and scripts/smoke-playwright.mjs: const NPX_BIN = process.platform === 'win32' ? 'npx.cmd' : 'npx'; Heal tests updated to accept either name so the suite is platform-agnostic. 1421 tests pass · typecheck clean · prettier clean.

Previous lockfile regenerated in half state — only captured a subset of @emnapi transitive optionals visible from my local Node 24 cache, so GitHub Actions Node 22 runners saw a different set and rejected `npm ci` with EUSAGE. Clean-sweep reset: rm -rf node_modules && package-lock.json && npm cache clean --force && npm install --include=optional. Lockfile now carries @emnapi/core 1.2.1 + 1.9.2 + 1.10.0 so every hoist path npm ci might want is recorded. `npm ci --dry-run` clean locally. 1421/1421 tests pass.

Windows playwright smoke got past ENOENT with the npx.cmd rename, but hit EINVAL on spawn. That's Node 20.19+ rejecting direct spawn of .cmd/.bat shims as a CVE-2024-27980 mitigation — the only supported way to launch them is via shell. Added an auto-detect in common/subprocess.ts runStreaming and the smoke driver's run() wrapper: on Windows, if command ends in .cmd or .bat, pass shell:true. Args come from our own code (not user input), so no new injection surface. 1421 tests still pass. Targeted at the last red job on #86.

Five issues surfaced by Copilot review: 1. scripts/lib/mcp-stdio-client.mjs: readPinnedConstant used require() inside an ESM module — would ReferenceError if ever called. Dead code, removed. The three smoke scripts have their own inline pin readers. 2. common/lifecycle.ts runHeal(): setTimeout was never cleared when heal resolved first, leaving a live timer until totalTimeoutMs elapsed. On short heal runs this kept short-lived CLI processes alive and accumulated timers across hot reloads. Track the handle explicitly, clearTimeout on every exit path, and .unref() so it doesn't block process exit in any case. 3. common/lifecycle.ts HealResult JSDoc: 'failed' contract said the MCP server should not be spawned, but the loader doesn't gate on that. Updated the comment to describe actual behavior (loader logs + tries anyway so transient heal issues don't brick startup). 4. github/index.ts validateConfig: erroring on missing token contradicted the header docs that say token is optional (falls back to gh auth token; server surfaces its own 401 on authenticated calls). Blocking plugin registration here made '{"github":{"enabled":true}}' unusable without a token. Downgraded to a runtime logWarn in init(). 5. common/subprocess.ts line buffering: never flushed trailing content that lacked a final newline — pip sometimes exits after printing a final status line unterminated, and we'd silently drop it. Added flushCarry() on child close + regression test covering the case. 1422/1422 tests pass (was 1421 + 1 new flush test). Typecheck clean, prettier clean.

Three more findings, all legit: 1. .github/workflows/plugin-smoke.yml: push.paths didn't match pull_request.paths — src/util/config.ts and package.json/-lock.json were missing. That meant pin bumps or config-field changes could land on main without running the smoke matrix until the nightly cron picked them up. Aligned push.paths with pull_request.paths. 2. src/plugins/mempalace/index.ts: validateConfig() JSDoc claimed it failed 'on obviously bad paths' but the body just returned undefined. Clarified the comment to document the actual intent: all runtime checks are async in heal() because they have better error context there. 3. src/plugins/common/subprocess.ts: runStreaming() only resolved on 'close'. Node's contract allows pre-spawn failures (ENOENT) to fire 'error' without ever firing 'close' — that would have hung the promise until the outer heal timeout. Extracted settleOnce() used by both handlers; the error handler schedules it via queueMicrotask so we win the race if close never arrives. Added a regression test with suppressClose:true that hangs indefinitely if broken. 1423/1423 tests (was 1422 + 1 new). Typecheck clean, prettier clean.

Five more findings, all real: 1. lifecycle.ts HealResult.elapsedMs JSDoc claimed 'sum of step elapsed' but implementations use wall-clock. Updated comment to match. 2. github/index.ts always passed '-e GITHUB_PERSONAL_ACCESS_TOKEN' to docker run, which forces the container to see the var as '' even when the host has no token. Empty-string tokens and unset tokens aren't equivalent in most auth layers (empty → hard 401; unset → may fall through). Now conditionally include '-e' only when a token is resolved. 3. mempalace/heal.ts verify-only mode: missing mempalace package was classified as 'executable-not-found', but the Python interpreter is present — it's the package that's missing. Changed to kind: 'unknown' with a clearer 'package is not installed' message. Updated the test to lock this in. 4. playwright/heal.ts: unsupported browser name was classified as 'version-mismatch'. Not a version issue — it's invalid config. Changed to kind: 'unknown'. Hint already makes the fix obvious. 5. progress.ts records() returned the internal array directly. The readonly type is only a TypeScript-compile-time guard; at runtime a caller could push/splice into it and corrupt logger state. Return records.slice() for true encapsulation. 1423/1423 tests pass (kept the same total — replaced a mempalace expectation). Typecheck clean, prettier clean.

Three more legit findings, all in playwright/index.ts: 1. endpoint-file read path swallowed the underlying error (ENOENT vs EACCES vs EISDIR — all relevant to debugging). Now captures err.code + err.message in the logWarn so users can see *why* the file read failed, not just that it did. 2. validateConfig() blocked plugin registration on missing mcpBin, bypassing heal's structured logging path. Per the consistent design of the other built-ins (mempalace + github), validateConfig is kept non-blocking — heal's PluginError path handles missing CLI with better context. Only the deterministic browser-name check remains synchronous (heal can't recover from that, and it's pure config). 3. config.browser used ?? for defaulting, so '' (allowed by zod) fell through as literal empty string and hard-failed as 'invalid browser'. Now trim + treat blank as unset — typical misconfig (empty env var, partial template) defaults to chromium instead of bricking load. Removed now-unused existsSync import. 1423/1423 still passing, typecheck + prettier + lint all clean.

Previously the pinned version lived in two hand-maintained spots: src/plugins/playwright/heal.ts (`PLAYWRIGHT_MCP_VERSION = "0.0.70"`) and scripts/smoke-playwright.mjs (regex parsing the constant out of heal.ts). Dependabot's auto-bump PRs only touch package.json + the lockfile, so every bump went red on the smoke matrix until someone hand-edited heal.ts to match. Fix: both sites now read the pin from package.json at module / script load. Caret/tilde range prefixes are tolerated and stripped so the comparison stays robust if anyone relaxes the pin locally. Added a drift guard test: PLAYWRIGHT_MCP_VERSION must equal the sanitized package.json range. Regression catches any future reintroduction of a hand-edited constant. 1424 tests passing, typecheck clean, local smoke install step resolves 0.0.70 end-to-end.

….2 → 3.3.4 The venv on the VPS was silently upgraded to 3.3.4 (auto-tunnels PR upstream) between when this branch was cut and now. Merging with the pin still at 3.3.2 would cause the heal to downgrade the installed version on every startup — the opposite of what users want. 3.3.4 released 2026-05-01 with native auto-tunnel support (compute_topic_tunnels). Smoke matrix passes on all three platforms at this version (verified on VPS). Noted as a stale pin in the PR review (heartbeat #3, 2026-05-01 20:45 UTC).

…bump npm ci requires package.json and package-lock.json to be in sync. The previous commit updated package.json only; this regenerates the lockfile so all CI jobs pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous lockfile (84aa614) was generated with --package-lock-only which didn't resolve all transitive dependencies of @playwright/mcp@0.0.72. CI was failing because @emnapi/core@{1.9.2,1.10.0} and @emnapi/runtime@{1.9.2,1.10.0} were missing from the lockfile. Full npm install (no flags) resolves the complete dependency tree including those onnxruntime-node optional peer deps. All present now. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Same pattern as 758e6ed (Apr 24) which fixed the identical EUSAGE. rm -rf node_modules && rm package-lock.json && npm cache clean --force && npm install --include=optional captures all @emnapi transitive optional deps (1.9.2 + 1.10.0) regardless of host platform, so npm ci passes on linux/macos/windows runners. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…/mcp 0.0.73) Rebased feat/plugin-lifecycle-overhaul onto origin/main (a151ece). Resolved conflicts: - package.json: took main's ^0.0.73 playwright pin - Dockerfile: took main's f44d05c (musl-strip RUN replaces postinstall hook) - Skipped: f9f58c1 (0.0.70→0.0.72 pin bump — superseded by ^0.0.73 on main) After rebase, ran npm install --include=optional to sync lockfile. tsc --noEmit: clean. oxlint: 0 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claudiusthebot · 2026-05-05T23:09:22Z

Rebased onto main (now at a151ece, v1.9.1). This fixes the CONFLICTING state.

What was resolved:

package.json: took main's ^0.0.73 playwright pin (vs branch's 0.0.70 original)
Dockerfile: took main's f44d05c docker fix (musl-strip RUN rm -rf replaces the branch's COPY scripts/ + postinstall approach — same end result, main's implementation is cleaner)
Skipped commit f9f58c1 (playwright pin bump 0.0.70→0.0.72) — superseded by ^0.0.73 on main
Re-ran npm install --include=optional to sync lockfile; new head: 7bc28b0
tsc --noEmit: clean. oxlint: 0 errors.
CI dispatched manually (gh workflow run ci.yml).

Feature code (plugin lifecycle, heal scripts, tests) applied cleanly — no source conflicts.

…ure) Previous lockfile regen (7bc28b0) was missing @emnapi/core + @emnapi/runtime top-level package entries that npm ci requires for @oxc-parser/binding-wasm32-wasi. Fix: take origin/main's lockfile directly — package.json is identical to main (playwright: ^0.0.73 on both) so main's lockfile is valid for this branch. Main's lockfile has @emnapi/core@1.10.0 and @emnapi/runtime@1.10.0 as proper top-level entries and passed CI repeatedly. @playwright/mcp@0.0.73 included. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The postinstall hook in package.json runs scripts/prune-native-sdk.mjs automatically during npm ci. Without COPY scripts/ scripts/ before the npm ci step, Docker fails with 'npm error command failed: node scripts/ prune-native-sdk.mjs'. This was inadvertently dropped when resolving Dockerfile conflicts during the rebase onto main (hb #103). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claudiusthebot · 2026-05-05T23:24:36Z

Heartbeat update — hb #103 addendum / Dockerfile fix ($(date -u '+%Y-%m-%d %H:%M UTC'))

Second CI run (25407548569) identified the remaining failure: Docker Build was failing because npm ci --omit=dev triggered the postinstall hook (node scripts/prune-native-sdk.mjs) but scripts/ wasn't copied into the image yet.

Root cause: When resolving the Dockerfile rebase conflict, I took main's version (which uses RUN rm -rf instead of the postinstall script). Main's Dockerfile works because main's package.json has no postinstall hook. But this branch does have "postinstall": "node scripts/prune-native-sdk.mjs" in package.json (intentional — it makes the pruning work for npm install too, not just Docker). So the Dockerfile needed COPY scripts/ scripts/ before npm ci.

Fix: Added COPY scripts/ scripts/ back into the Dockerfile before npm ci (commit f5b474f). The RUN rm -rf line is kept as belt-and-suspenders (harmless since postinstall already prunes it in Docker context).

Current test status: All other jobs were green in run 25407548569 — Code Quality ✓, Fuzz ✓, Tests (Node 22 + 24) ✓, Functional (3 platforms) ✓, Security Audit ✓. The [dream] Log write error messages in test output were console.warn during passing tests, not failures.

CI dispatched for f5b474f — awaiting Docker Build result.

claudiusthebot · 2026-05-05T23:26:29Z

CI PASSED ✓ — $(date -u '+%Y-%m-%d %H:%M UTC')

All 10 jobs green on f5b474f (run 25407794598):

Tests (Node 22 + 24) ✓
Fuzz Testing ✓
Security Audit ✓
Code Quality ✓
Docker Build ✓ (was the last blocker — fixed by COPY scripts/ scripts/ before npm ci)
Functional (ubuntu/macos/windows) ✓
CI Status ✓

PR #86 is fully CI-green and MERGEABLE. Branch feat/plugin-lifecycle-overhaul is now at f5b474f. Ready for Dylan's merge.

This was referenced Apr 23, 2026

fix(mempalace): require >= 3.3.2 for PID guard + HNSW crash fix #83

Closed

feat(mempalace): automated onboarding + pinned version + multi-OS CI smoke #84

Closed

feat(github,playwright): automated onboarding + pinned versions + CI smoke #85

Closed

dylanneve1 requested review from claudiusthebot and Copilot April 24, 2026 08:29

Copilot started reviewing on behalf of dylanneve1 April 24, 2026 08:29 View session