feat(solve): experimental --keep-working-until-all-requirements-are-fully-done (#1883) #1884
Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: #1883
…ully-done (#1883)

Scan PR description, AI solution summary, and changed markdown for deferred-work indicators (out of scope, future work, follow-up PR, deferred, delayed, TODO/TBD, etc.) using ~14 regexes. When found, auto-restart the AI with the detected reasons plus a verbatim reinforcement prompt, until the scan is clean or the restart limit is hit (default 5; explicit count; forever/unlimited/0 -> no limit, with a 3 consecutive-error safety cap).

- Pure, network-free detection in src/solve.keep-working.detect.lib.mjs (unit-tested)
- Orchestration in src/solve.keep-working.lib.mjs
- CLI option + normalization in src/solve.config.lib.mjs, aliases keep-going/keep-working
- Wired into post-solve flow via applyRestartResult() in src/solve.mjs
- 31 tests in tests/test-keep-working-until-done-1883.mjs
- Docs in CONFIGURATION.md (+ru/zh/hi) and case study under docs/case-studies/issue-1883/
- Changeset (minor)
Working session summary
CI is fully green (
Summary
Issue #1883 is implemented and shipped on PR #1884, which is now marked ready for review with all CI checks passing.
What was built
An experimental
Limit semantics: bare → 5; explicit count → that number;
Key files
Verification
PR: #1884
This summary was automatically extracted from the AI working session output.
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost: $10.426117
📊 Context and tokens usage:
Claude Opus 4.8: (3 sub-sessions)
Total: (36.3K new + 338.5K cache writes + 10.7M cache reads) input tokens, 108.4K output tokens, $10.379106 cost
Claude Haiku 4.5:
Total: 22.6K input tokens, 888 output tokens, $0.047011 cost
🤖 Models used:
📎 Log file uploaded as Gist (6720KB)
The working session has now ended; feel free to review and add any feedback on the solution draft.
✅ Ready to merge
This pull request is now ready to be merged:
Monitored by hive-mind with the --auto-restart-until-mergeable flag
This reverts commit f4593f1.
Summary
Implements the experimental solve option requested in #1883: --keep-working-until-all-requirements-are-fully-done.

After the main run (and any --finalize pass), the feature scans three cheap, token-free sources (the PR description, the AI solution summary, and the added lines of changed markdown documents) for strong indicators of deferred work ("out of scope", "future work", "follow-up PR", "deferred", "delayed", "TODO"/"TBD", etc.) using ~14 regular expressions. When indicators are found, it auto-restarts the AI tool with the concrete detected reasons plus the verbatim reinforcement prompt from the issue, and repeats until the scan is clean or the restart limit is reached.
Closes #1883.
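The scan described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the five patterns are a hypothetical subset (the real DEFERRED_WORK_PATTERNS list of ~14 regexes is not reproduced here), and the names addedLines and detectDeferredWork are made up for this example.

```javascript
// Hypothetical subset of deferred-work patterns (assumption, not the PR's list).
const PATTERNS = [
  { reason: 'out of scope', regex: /\bout[- ]of[- ]scope\b/i },
  { reason: 'future work', regex: /\bfuture work\b/i },
  { reason: 'follow-up PR', regex: /\bfollow[- ]?up (?:PR|pull request)\b/i },
  { reason: 'deferred', regex: /\bdeferred\b/i },
  { reason: 'TODO/TBD', regex: /\b(?:TODO|TBD)\b/ },
];

// Keep only the lines a unified diff adds ("+..."), skipping "+++" file headers.
function addedLines(patch) {
  return patch
    .split('\n')
    .filter((line) => line.startsWith('+') && !line.startsWith('+++'))
    .map((line) => line.slice(1))
    .join('\n');
}

// Scan named text sources line by line and report every match as a finding.
// The regexes carry no "g" flag, so test() keeps no state between calls.
function detectDeferredWork(sources) {
  const findings = [];
  for (const [source, text] of Object.entries(sources)) {
    for (const line of text.split('\n')) {
      for (const { reason, regex } of PATTERNS) {
        if (regex.test(line)) findings.push({ source, reason, line: line.trim() });
      }
    }
  }
  return findings;
}

const findings = detectDeferredWork({
  prDescription: 'Adds retries. Caching is out of scope for this PR.',
  solutionSummary: 'All requested behaviour is implemented.',
  markdownDiff: addedLines('+++ b/README.md\n+## Usage\n+TODO: document flags\n-old line'),
});
console.log(findings); // two findings: one from prDescription, one from markdownDiff
```

Because the function only takes strings and returns plain objects, it needs no network access or mocks, which matches the "pure, network-free detection" goal stated in the commit message.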
Usage
Limit semantics: bare flag → 5; explicit number → that count; forever/unlimited/infinite/0 → no limit (with a hard safety cap of 3 consecutive errors so a broken tool can never spin forever).
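As a sketch of these limit semantics, assuming the CLI parser hands over `true` for a bare flag and a string or number otherwise. The helper name normalizeLimit is hypothetical (the PR's real function is normalizeKeepWorkingLimit, whose exact inputs and outputs are not shown here); returning `null` for an absent flag and throwing on invalid input are also assumptions.

```javascript
const DEFAULT_LIMIT = 5; // default for the bare flag, per the PR description

// Hypothetical normalizer: maps a raw CLI value to a restart limit.
function normalizeLimit(value) {
  // Flag absent: feature disabled (assumed representation).
  if (value === undefined || value === false) return null;
  // Bare flag, assumed to arrive as boolean true or an empty string.
  if (value === true || value === '') return DEFAULT_LIMIT;
  const text = String(value).trim().toLowerCase();
  // Keywords and 0 all mean "no limit".
  if (['forever', 'unlimited', 'infinite', '0'].includes(text)) return Infinity;
  const count = Number.parseInt(text, 10);
  // Accept only clean positive integers such as "3", not "3x" or "-1".
  if (Number.isInteger(count) && count > 0 && String(count) === text) return count;
  throw new Error(`Invalid keep-working limit: ${value}`);
}

console.log(normalizeLimit(true)); // 5
console.log(normalizeLimit('3')); // 3
console.log(normalizeLimit('forever')); // Infinity
```

Representing "no limit" as `Infinity` keeps the loop condition a plain numeric comparison; the 3-consecutive-error cap is then enforced separately by the restart loop.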
How it maps to the issue requirements
- --keep-working-until-all-requirements-are-fully-done: src/solve.config.lib.mjs ([EXPERIMENTAL])
- runKeepWorkingUntilDone in src/solve.keep-working.lib.mjs
- KEEP_WORKING_PROMPT + buildKeepWorkingFeedback
- DEFERRED_WORK_PATTERNS (14 regexes)
- collectDeferredWorkSources
- forever/unlimited: normalizeKeepWorkingLimit
- --keep-going-until-all-requirements-are-fully-done
- docs/case-studies/issue-{id}: docs/case-studies/issue-1883/

A full breakdown lives in docs/case-studies/issue-1883/requirements.md.

Design
- Pure detection in src/solve.keep-working.detect.lib.mjs (regexes, limit normalization, feedback building) → fully unit-testable without mocks, mirroring the repo's auto-iteration-limits.lib.mjs idiom.
- Orchestration (gh api + the restart loop) in src/solve.keep-working.lib.mjs.
- Wired into src/solve.mjs (via a small shared applyRestartResult helper that also de-duplicates the existing restart/finalize cost-merge blocks).
- The reinforcement prompt does not self-trigger (unit-tested); the prompt and the feedback block are never scanned; restarts are bounded; and forever mode still aborts after 3 consecutive tool errors. Each restart disables nested keep-working to prevent recursion.
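The bounded restart loop and its safety cap might look roughly like the sketch below. It is not the PR's runKeepWorkingUntilDone: the function keepWorking, its injected scan and restart callbacks, the `{ keepWorking: false }` option shape, and the choice to count failed attempts toward the limit are all assumptions made for illustration.

```javascript
const MAX_CONSECUTIVE_ERRORS = 3; // safety cap named in the PR description

// scan(): resolves to an array of detected reasons (empty when clean).
// restart(reasons, options): re-runs the AI tool; may throw on tool errors.
// Both callbacks are hypothetical and injected by the caller.
async function keepWorking({ limit, scan, restart }) {
  let restarts = 0;
  let consecutiveErrors = 0;
  while (true) {
    const reasons = await scan();
    if (reasons.length === 0) return { status: 'clean', restarts };
    if (restarts >= limit) return { status: 'limit-reached', restarts };
    restarts += 1; // assumption: failed attempts also count toward the limit
    try {
      // Nested keep-working is switched off so a restart cannot recurse.
      await restart(reasons, { keepWorking: false });
      consecutiveErrors = 0;
    } catch (error) {
      consecutiveErrors += 1;
      // Even with limit === Infinity, a broken tool stops the loop here.
      if (consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
        return { status: 'aborted', restarts, error };
      }
    }
  }
}

// "forever" mode with a tool that always crashes still terminates.
keepWorking({
  limit: Infinity,
  scan: async () => ['TODO in README.md'],
  restart: async () => {
    throw new Error('tool crashed');
  },
}).then((result) => console.log(result.status, result.restarts)); // logs: aborted 3
```

Scanning before checking the limit means a final restart that fixes everything is still reported as clean rather than as a limit hit.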
Tests
- tests/test-keep-working-until-done-1883.mjs: 31 tests, all passing (detection, self-match avoidance, limit normalization for every CLI variant, patch extraction, feedback rendering, and end-to-end CLI parsing of the flag and its aliases).
- npm run lint clean; docs-sync tests (test-docs-options-sync, test-docs-language-sync) pass across CONFIGURATION.md + .ru/.zh/.hi.

How to reproduce the problem this fixes
Run an AI solver on a large issue without the flag: it frequently ships a partial PR whose description says things like "caching is out of scope for this PR" or leaves TODOs, and reports the issue done. Because this workflow has no follow-up PR, that work is lost. With --keep-working-until-all-requirements-are-fully-done, those phrases are detected and the AI is restarted to finish them.
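To make the restart step concrete, here is a rough sketch of how detected phrases could be turned into the feedback sent back to the AI. The function buildFeedback, the finding shape, and the header wording are assumptions, not the PR's buildKeepWorkingFeedback; the reinforcement prompt is a placeholder because the verbatim text from the issue is not reproduced here.

```javascript
// Placeholder only: the verbatim reinforcement prompt from issue #1883 is not shown here.
const PROMPT_PLACEHOLDER = '<reinforcement prompt from the issue goes here>';

// Render findings as a numbered list followed by the reinforcement prompt.
function buildFeedback(findings) {
  const items = findings.map(
    ({ source, reason, line }, index) => `${index + 1}. [${source}] ${reason}: "${line}"`,
  );
  return ['Indicators of unfinished work were detected:', ...items, '', PROMPT_PLACEHOLDER].join('\n');
}

const feedback = buildFeedback([
  {
    source: 'prDescription',
    reason: 'out of scope',
    line: 'caching is out of scope for this PR',
  },
]);
console.log(feedback);
```

Quoting the offending line together with its source lets the restarted run see exactly which statement triggered the restart, rather than receiving only a generic instruction to continue.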
Docs
- docs/CONFIGURATION.md (+ .ru, .zh, .hi): new option row.
- docs/case-studies/issue-1883/: deep case study (overview, full requirement list with solution plans, root-cause analysis, existing-components/prior-art survey, and the indicator catalogue).
- Changeset: minor.