Skip to content

Fix Kiro usage command pipe hangs#1535

Merged
steipete merged 4 commits into
steipete:mainfrom
kiranmagic7:kiran/fix-kiro-bounded-command-runner
Jun 15, 2026
Merged

Fix Kiro usage command pipe hangs#1535
steipete merged 4 commits into
steipete:mainfrom
kiranmagic7:kiran/fix-kiro-bounded-command-runner

Conversation

@kiranmagic7

Copy link
Copy Markdown
Contributor

Summary

Fixes #1533.

Kiro usage refresh could still wait forever after the direct kiro-cli process exited or ignored termination, because the runner used an unbounded waitUntilExit() plus final EOF reads on stdout/stderr. This keeps the existing activity-based behavior, but bounds process cleanup and stops relying on EOF from inherited pipes.

The regression uses a fake kiro-cli that prints valid usage, starts a child that keeps stdout open, and exits. fetch() now returns from the parent output instead of waiting for the child.

Proof

  • swiftc -swift-version 6 -parse Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift Tests/CodexBarTests/KiroStatusProbeTests.swift

I also attempted the focused SwiftPM test locally, but SwiftPM hung while downloading Sparkle's binary artifact before it reached compilation. The added regression should run under normal CI package resolution.

@clawsweeper

clawsweeper Bot commented Jun 14, 2026

Copy link
Copy Markdown

Codex review: needs real behavior proof before merge. Reviewed June 15, 2026, 1:30 AM ET / 05:30 UTC.

Summary
The PR refactors Kiro usage command execution to use bounded pipe capture and process-tree termination, hardens shared subprocess cleanup, and adds fake-process regressions.

Reproducibility: yes. at source level. Current main still reaches waitUntilExit after termination and then readDataToEndOfFile on stdout/stderr, matching the linked detached-child and ignored-SIGTERM reproduction shapes.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 7 files, +368/-110. The patch touches both the Kiro provider and shared process helpers, so runtime cleanup behavior needs proof beyond static parsing.
  • Regression coverage: 5 async subprocess/Kiro cases added. The added tests cover the core hang shapes, but they still need to run successfully on the latest head.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Post redacted terminal output, logs, or a terminal screenshot from a focused fake-cli or Swift test run on the latest head; redact private paths, endpoints, tokens, and account data.
  • [P1] Let the pending CI jobs complete for the latest head before merge.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body/comment include static parse, lint, diff-check output, and a stalled test attempt, but no after-fix real behavior proof from the latest head. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] The contributor has not posted after-fix real behavior proof from the latest head; static parsing, lint, and diff-check output do not prove the subprocess path returns in practice.
  • [P1] CI for the latest head is still pending, including lint-build-test and both Linux CLI builds.
  • [P1] The diff changes shared subprocess termination behavior as well as Kiro usage cleanup, so availability regressions need focused CI/proof before merge.

Maintainer options:

  1. Require runtime proof before merge (recommended)
    Ask for redacted terminal output or logs from a focused fake-cli or Swift test run on the latest head, then merge only after CI completes.
  2. Accept maintainer runtime risk
    Maintainers could merge after code review and CI alone, but that accepts subprocess availability risk without contributor-visible proof of the changed cleanup paths.

Next step before merge

  • [P1] Contributor proof and pending CI are the remaining merge gates; automation cannot supply real behavior proof from the contributor's setup.

Security
Cleared: No concrete security or supply-chain concern was found; the diff changes local subprocess lifecycle code and fake-process tests without new dependencies or secret handling.

Review details

Best possible solution:

Land the bounded Kiro runner after redacted focused fake-cli or Swift test output and CI confirm the inherited-pipe, ignored-SIGTERM, cancellation, and escaped-descendant paths pass without live Kiro credentials.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main still reaches waitUntilExit after termination and then readDataToEndOfFile on stdout/stderr, matching the linked detached-child and ignored-SIGTERM reproduction shapes.

Is this the best way to solve the issue?

Yes, the branch follows the narrow maintainable direction by reusing bounded pipe capture and process cleanup while preserving Kiro validation semantics. The remaining blocker is proof and completed CI, not a clear code defect in the latest diff.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against fc8513228c5f.

Label changes

Label justifications:

  • P2: This is a normal-priority provider refresh hang fix with limited blast radius but real availability impact.
  • merge-risk: 🚨 availability: The PR changes subprocess lifecycle handling, and merging without runtime proof could leave hangs or cleanup regressions undiscovered.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body/comment include static parse, lint, diff-check output, and a stalled test attempt, but no after-fix real behavior proof from the latest head. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; its guidance to prefer focused provider/parser tests and avoid live provider or Keychain validation unless requested applies to this Kiro CLI review. (AGENTS.md:1, fc8513228c5f)
  • Current main still has the hang shape: Current main's Kiro runner still terminates with waitUntilExit and then reads stdout/stderr to EOF, matching the linked inherited-pipe and ignored-SIGTERM source reproduction. (Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift:503, fc8513228c5f)
  • Latest PR head uses bounded cleanup: At the latest PR head, cancellation terminates the process, captures are stopped, and stdout/stderr are drained via ProcessPipeCapture.finish(timeout:) rather than unbounded EOF reads. (Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift:478, 8976fa0656c8)
  • Regression coverage added: The PR adds fake-CLI/process tests for inherited pipes, ignored SIGTERM, completed no-output failure status, cancellation cleanup, and escaped descendants. (Tests/CodexBarTests/KiroStatusProbeTests.swift:11, 8976fa0656c8)
  • Related owner issue confirms fix shape: The linked issue describes the same unbounded wait and EOF-drain problem and asks for bounded escalation, bounded post-exit drains, and non-live regressions.
  • Proof and CI state: The PR body/comment provide static parse, lint, diff-check output, and a stalled SwiftPM attempt; gh reports lint-build-test and Linux CLI builds still pending for head 8976fa0. (8976fa0656c8)

Likely related people:

  • steipete: Authored recent bounded subprocess output draining, opened the linked owner issue, and force-pushed the latest hardening commits on this PR. (role: recent area contributor and likely follow-up owner; confidence: high; commits: ac8159f6b10c, b3f74cabeb52, 8976fa0656c8; files: Sources/CodexBarCore/Host/Process/ProcessPipeCapture.swift, Sources/CodexBarCore/Host/Process/SubprocessRunner.swift, Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift)
  • chadneal: History shows the activity-based Kiro idle handling and incremental capture behavior came from this contributor's Kiro timeout work. (role: introduced relevant timeout behavior; confidence: high; commits: a0bb8dd86010, e23a0c4411ea; files: Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift, Tests/CodexBarTests/KiroStatusProbeTests.swift)
  • neror: Git history and changelog provenance show the original Kiro provider, probe, descriptor, and tests came from this contributor's work. (role: original Kiro provider contributor; confidence: high; commits: bbda93528714; files: Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift, Sources/CodexBarCore/Providers/Kiro/KiroProviderDescriptor.swift, Tests/CodexBarTests/KiroStatusProbeTests.swift)
  • solnikhil: Recent Kiro account, usage, and display work touched the same provider and test surface. (role: recent Kiro usage contributor; confidence: medium; commits: 66b8277d87b0; files: Sources/CodexBarCore/Providers/Kiro/KiroStatusProbe.swift, Tests/CodexBarTests/KiroStatusProbeTests.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: edb315dcc3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

didTerminateForIdle = true
break
}
try await Task.sleep(for: .milliseconds(100))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clean up the subprocess when cancellation lands

When a provider refresh is superseded or its waiter is cancelled, UsageStore cancels the refresh task; with this new cancellable sleep, that cancellation throws out of runCommand while kiro-cli is still running. Because there is no cancellation handler or defer that clears the pipe handlers and terminates the process before rethrowing, a hung or slow kiro-cli can be left running in the background after the UI has abandoned the refresh.

Useful? React with 👍 / 👎.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 14, 2026
@kiranmagic7

Copy link
Copy Markdown
Contributor Author

Follow-up pushed in fba48abe to clear the lint blockers on this branch.

What changed:

  • Applied the repo SwiftFormat numeric-literal style in KiroStatusProbe.
  • Reworked the fake CLI child-process fixture as a heredoc so SwiftLint line length stays green without changing the fixture behavior.

Local proof:

$ ./Scripts/lint.sh lint
Done linting! Found 0 violations, 0 serious in 1098 files.

$ git diff --check
# no output

I also tried swift test --filter KiroStatusProbeTests --skip-update; SwiftPM stalled while downloading the Sparkle 2.9.1 binary artifact before compiling, so I am leaving focused test proof to the current CI run on fba48abe.

@steipete steipete force-pushed the kiran/fix-kiro-bounded-command-runner branch from fba48ab to ad14534 Compare June 15, 2026 05:02
@clawsweeper clawsweeper Bot added P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 availability 🚨 Merging this PR could cause crashes, hangs, restart loops, stalls, or process outages. labels Jun 15, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8976fa0656

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +499 to +500
async let stdoutData = stdoutCapture.finish(timeout: .seconds(1))
async let stderrData = stderrCapture.finish(timeout: .seconds(1))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Terminate pipe-holding helpers after bounded drain

When kiro-cli exits after printing complete usage but leaves a helper holding stdout/stderr open, process.isRunning is already false so the termination block above is skipped; these bounded finish calls then just stop reading after one second and return the snapshot. In that inherited-pipe scenario a hung or long-lived helper survives the refresh, so repeated refreshes can leak CLI child processes instead of only avoiding the UI hang.

Useful? React with 👍 / 👎.

@steipete steipete merged commit 209b6c8 into steipete:main Jun 15, 2026
4 checks passed
steipete added a commit that referenced this pull request Jun 15, 2026
Launch Kiro CLI commands in a dedicated pre-exec process group and clean up inherited-pipe holders and residual helpers using start-time-verified identities.

Add bounded termination/draining, portable pre-reap process tracking, Linux descriptor isolation, and focused lifecycle regressions.

Follow-up to #1533 and #1535.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 availability 🚨 Merging this PR could cause crashes, hangs, restart loops, stalls, or process outages. P2 Normal priority bug or improvement with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Kiro usage refresh can hang on process termination or inherited pipes

2 participants