feat(cli): init prompts to mine, mine handles Ctrl-C gracefully (#1181, #1182)#1183
feat(cli): init prompts to mine, mine handles Ctrl-C gracefully (#1181, #1182)#1183
Conversation
There was a problem hiding this comment.
Pull request overview
Improves the init → mine CLI flow by prompting to start mining immediately after initialization and by making mempalace mine handle Ctrl‑C with a clean summary + proper exit code, while also ensuring the hooks PID lock file is cleaned up on all exit paths.
Changes:
- Add
_maybe_run_mine_after_init()to optionally runmine()in-process aftercmd_init, with--yesskipping the prompt. - Catch
KeyboardInterruptinsideminer.mine()to print an interrupt summary and exit with code130, plus always run PID-file cleanup infinally. - Add unit tests covering the new prompt behavior, Ctrl‑C summary behavior, and PID-file cleanup semantics; document changes in the changelog.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
mempalace/cli.py |
Adds post-init “mine now?” prompt + --yes auto-mine helper. |
mempalace/miner.py |
Adds Ctrl‑C summary/exit(130) handling and PID-file cleanup in finally. |
tests/test_cli.py |
Adds tests for the init→mine prompt helper, including --yes, decline, EOF, and failure cases. |
tests/test_miner.py |
Adds tests for Ctrl‑C summary/exit code and PID-file cleanup behavior. |
CHANGELOG.md |
Documents the new init prompt and graceful Ctrl‑C behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Pre-scan so the user knows roughly what they're agreeing to before | ||
| # the prompt. The scan is fast and we'd run it inside mine() anyway. | ||
| try: | ||
| files = scan_project(project_dir) | ||
| file_count = len(files) | ||
| except Exception: | ||
| file_count = None | ||
|
|
||
| if auto: | ||
| print("\n Auto-mining (--yes).") | ||
| else: |
| # we don't block. User can re-run with --yes to opt in. | ||
| answer = "n" | ||
| if answer not in ("", "y", "yes"): | ||
| print(f"\n Skipped. Run `mempalace mine {project_dir}` when ready.") |
| f"\n Re-run `mempalace mine {project_dir}` to resume — " | ||
| "already-filed drawers are\n upserted idempotently and will not duplicate.\n" | ||
| ) |
bensig
left a comment
There was a problem hiding this comment.
Reviewed locally on `feat/init-mine-ux`. Approve, no asks. Tight scope, well-tested, backward-compat preserved by deliberate design.
Full pytest: 1247 passed in 48s — +13 from v3.3.3 baseline, mapping to the new test_cli/test_miner cases this PR adds.
Strengths
- Backward-compat flag matrix is the right call. `--yes` stays scoped to entity auto-accept; the new `--auto-mine` flag is the mine-skip toggle. Anyone running `mempalace init --yes ` in a script today gets exactly the same behaviour after this lands. The matrix in the PR body is explicit, and the inline comment in `cmd_init` ("`--yes` is SCOPED to entity auto-accept and does NOT imply mining") makes the design defendable in code review six months from now.
- Single corpus walk. `scan_project`'s output feeds both the user-visible scope estimate AND `mine()` via the new `files=` kwarg. No double-walk. The `mine()` signature change preserves backward-compat (`files=None` falls through to the existing `scan_project` call).
- Footgun mitigation is honest. Scope estimate prints BEFORE the prompt and the comment explicitly names the failure mode ("hitting Enter on a default-Y prompt with no size cue"). `<1 MB` floor on `_format_size_mb` so small projects never see a misleading `0 MB`.
- `EOFError` handling for piped stdin — non-interactive callers get "decline" instead of a `raise EOFError`. Won't block scripts that pipe `init` from somewhere.
- Errors surfaced, not swallowed. Mine failure → `sys.exit(1)` so downstream scripts see the non-zero status.
Ctrl-C handling
This is exactly the right level of mature:
- `try/except KeyboardInterrupt` around the file loop with an inner `try` capturing `last_file` for the summary.
- Outer `except` prints the clean `files_processed: N/M` / `drawers_filed: K` / `last_file: …` block.
- `sys.exit(130)` for standard SIGINT semantics — shells and CI runners read this correctly as user-interrupted.
- `finally` still runs (PID cleanup happens), and the comment explicitly calls out that a second Ctrl-C during the summary print propagates to the default handler. Knowing what you're not trying to catch is half the discipline.
- Resume guidance points at the idempotent re-mine path with deterministic IDs — already-filed drawers upsert to the same row, no duplicates. Honest description of why partial progress is safe to leave in place.
PID-file cleanup
`_cleanup_mine_pid_file` is its own function with a long docstring explaining the cross-session reasoning. The "only remove if the file claims our own PID" check is the part that prevents the bug where a separate concurrent mine gets its lock file yanked. Defensive in exactly the right way.
Minor observations (not blockers)
- `last_file` is updated inside both the inner exception handler and the post-`process_file` line — slightly redundant, but the comment explains why. Cleaner alternative would be to set `last_file` before `process_file()` so the variable always reflects "the file we attempted last", but the current pattern works.
- `_format_size_mb` and `_maybe_run_mine_after_init` extracted for unit-testability — exactly the right move and a contrast with the (still-good) PR #1185 where I had to ask for tests. Nothing to do here.
Ship it. Same review-quality bar as #1179.
`mempalace init` now ends with a `Mine this directory now? [Y/n]` prompt and runs `mine()` in-process when accepted; `--yes` skips the prompt and auto-mines for non-interactive callers. Declining prints the resume command. Removes the "remember to type the next command" friction since rooms + entities just got set up. `mempalace mine` now wraps its main loop in `try / except KeyboardInterrupt` and prints `files_processed`, `drawers_filed`, and `last_file` before exiting with code 130 on Ctrl-C. Re-mining is safe because deterministic drawer IDs make the upsert idempotent. The hooks PID lock at `~/.mempalace/hook_state/mine.pid` is now actively removed in a `finally` when its entry points at us, on clean exit, error, or interrupt — preventing the next hook fire from briefly waiting on a stale PID. Closes #1181, #1182.
…ore mine prompt
Reviewer feedback on the previous commit flagged two real problems:
1. Overloading --yes to also auto-mine was a silent behaviour change for
scripted callers. Today --yes only auto-accepts entities — making it
ALSO trigger a multi-minute ChromaDB write breaks every script that
currently runs `mempalace init --yes <dir>` for the fast non-interactive
entity path. Add a separate `--auto-mine` flag instead. Combinations:
mempalace init --yes <dir> # entities auto, STILL prompt mine
mempalace init --auto-mine <dir> # prompt entities, skip mine prompt
mempalace init --yes --auto-mine <dir> # fully non-interactive
--yes behaviour is now identical to pre-PR.
2. The mine prompt was firing without telling the user how big the job
was. On a real corpus mine takes minutes-to-tens-of-minutes; hitting
Enter on default-Y with no size cue is a footgun. Show a one-line
estimate computed from scan_project (the same walk we hand into mine)
BEFORE the prompt:
~423 files (~12 MB) would be mined into this palace.
Mine this directory now? [Y/n]
The estimate uses a single corpus walk: scan_project's output is
passed into mine() via a new optional files= kwarg, so we never walk
the tree twice.
Tests: replaced the old "--yes auto-mines" assertion with a regression
guard that --yes alone STILL prompts; added coverage for --auto-mine
alone, --yes --auto-mine together, and the pre-prompt estimate line.
b30fb25 to
23d534f
Compare
The "Skipped. Run mempalace mine <dir>" hint after declining the init prompt and the "Re-run mempalace mine <dir> to resume" hint after a Ctrl-C interruption both interpolated project_dir without shell-quoting. A path containing spaces or metacharacters produced a copy-paste-broken command. Both spots now use shlex.quote(project_dir). Adds regression tests covering each hint with a path that contains a space.
The pre-existing test_maybe_run_mine_prompt_declined_prints_hint
asserted the bare unquoted form `mempalace mine {tmp_path}`. After
the production code switched to shlex.quote on the resume hint, this
passed on Linux/macOS (POSIX paths have no characters that trigger
quoting) but failed on Windows where backslashes always get wrapped
in single quotes.
Mirror the production code in the assertion via shlex.quote so it's
portable across platforms; do the same for the two new
spaces-in-path tests for consistency.
After rebasing onto current develop: - chroma.py: keep develop's quarantine_stale_hnsw + UnsupportedFilterError validation alongside this PR's _pin_hnsw_threads retrofit. - tests/test_backends.py: combine quarantine_stale_hnsw and _pin_hnsw_threads test sections; ruff format. - miner.py: propagate the new `files=` kwarg (added on develop in MemPalace#1183 for the init -> mine flow) through _mine_impl so the caller can pass a pre-scanned file list under the global lock.
…HNSW fixes Bring in 29 commits from upstream/develop since the last merge (2026-04-23): Major absorbed changes: - MemPalace#976 (Felipe Truman): HNSW graph corruption fix, mine_global_lock for fan-out, MAX_PRECOMPACT_BLOCK_ATTEMPTS=2 for /compact deadlocks. Closes MemPalace#974/MemPalace#965/MemPalace#955; likely resolves MemPalace#1172 too. - MemPalace#1179 (Igor): CLI mempalace search routes through _hybrid_rank, legacy-metric warning + _warn_if_legacy_metric, invariant tests on hnsw:space=cosine across all 5 collection-creation paths. - MemPalace#1180/MemPalace#1183/MemPalace#1184: cross-wing topic tunnels, init mine UX, --auto-mine. - MemPalace#1185 (perf/batched-upsert-gpu): batched ChromaDB upserts, GPU device detection via mempalace/embedding.py. - MemPalace#1182: graceful Ctrl-C during mempalace mine. - MemPalace#1168: tunnel permissions security fix. Conflict resolutions (8 files): - searcher.py: kept fork's CLI delegation through search_memories (cleaner single-source-of-truth path); upstream's inline-retrieval branch dropped. Merged upstream's print formatting (cosine= + bm25=) with fork's matched_via reporting from sqlite BM25 fallback. - backends/chroma.py: kept fork's _BLOB_FIX_MARKER + palace_path arg to ChromaCollection (MemPalace#1171 write lock); merged upstream's **ef_kwargs (embedding_function support from MemPalace#1185). Removed duplicate _pin_hnsw_threads (fork already cherry-picked Felipe's earlier). - mcp_server.py: kept fork's palace_path arg + upstream's clearer comment on hnsw:num_threads=1 rationale. - miner.py: took upstream's serial mine() flow (mine_global_lock + Ctrl-C), RESTORED fork's strict detect_room — substring matching from upstream breaks fork-only test_detect_room_priority1_no_substring_match. Added `import re` for word-boundary regex matching. Fork-ahead concurrent mining (workers=, ThreadPoolExecutor from 5cd14bd) is dropped — daemon migration deprioritizes local concurrent mining; can re-port if needed. - CHANGELOG.md: combined fork's segfault-trio narrative with upstream's v3.3.4 release notes. - HOOKS_TUTORIAL.md: took MEMPALACE_PYTHON env var name (fork README was stale; hooks already use this name per fork-ahead MemPalace#19). - test_convo_miner_unit.py: took both contextlib + pytest imports. - test_searcher.py: imported upstream's 3 new TestSearchCLI tests but skipped them with TODOs — they assume upstream's inline-retrieval CLI with simpler mocks. Rewrite needed for fork's delegated search_memories path (sqlite BM25 fallback + scope counting). Test result: 1334 passed, 3 skipped (the upstream tests above), 3 warnings. Implications for open fork PRs: - MemPalace#1171 (cross-process write lock at adapter) becomes structurally redundant given MemPalace#976's mine_global_lock + daemon-strict serialization. Slated for close with thank-you to Felipe. - MemPalace#1173/MemPalace#1177 still relevant (quarantine threshold + blob_seq marker). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After today's work: - Merged upstream/develop (29 commits), absorbing MemPalace#976 (HNSW + mine_global_lock + PreCompact attempt-cap), MemPalace#1179 (CLI BM25), MemPalace#1180/MemPalace#1182/MemPalace#1183/MemPalace#1185 features. - Filed PR MemPalace#1198 (`_tokenize` None guard) upstream. - Closed PR MemPalace#1171 (adapter-level flock) — superseded by MemPalace#976 + daemon-strict. - palace-daemon migration commits (c09582c + 0e97b19) make the daemon the single canonical writer; in-tree adapter flock no longer carries its weight. README: - Lead paragraph: 2026-04-21 → 2026-04-25 sync date, 165,632 → 151,420 drawer count, daemon-on-disks topology, listed absorbed upstream PRs. - Fork change queue: dropped MemPalace#1171 row, added MemPalace#1198 row. - "Multi-client coordination" section: rewrote — palace-daemon is now the fork's primary answer (was "deferred"), MemPalace#1171 retired, narrative shifted to defense-in-depth around MemPalace#976 + daemon. - Two-layer memory model: storage path is now palace-daemon HTTP, not ~/.mempalace/palace. - Ecosystem palace-daemon row: marks integration as complete (commits cited). - Open upstream PRs table: refreshed to MemPalace org URLs, added MemPalace#1198 row, removed MemPalace#1171, added 2026-04-25 develop sync to merged list, added MemPalace#1171 to closed list with rationale. - Test count 1096 → 1334. CLAUDE.md: - Row 20 (MemPalace#1198): "Upstream PR pending" → "Filed as MemPalace#1198 on 2026-04-24". - Row MemPalace#1171 in PR table: open → closed (with MemPalace#976 supersession). - Added MemPalace#1198 row to PR table. - Top-of-section count: "14 merged, 7 open, 9 closed" → "14 merged, 7 open (including MemPalace#1198), 10 closed (added MemPalace#1171)" + sync date 2026-04-25. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After rebasing onto current develop: - chroma.py: keep develop's quarantine_stale_hnsw + UnsupportedFilterError validation alongside this PR's _pin_hnsw_threads retrofit. - tests/test_backends.py: combine quarantine_stale_hnsw and _pin_hnsw_threads test sections; ruff format. - miner.py: propagate the new `files=` kwarg (added on develop in MemPalace#1183 for the init -> mine flow) through _mine_impl so the caller can pass a pre-scanned file list under the global lock.
Summary
UX polish for the
init->mineuser journey, addressing two linked issues that touch the same flow.#1181:
init->mineprompt with scope estimateAfter entity confirmation, room detection, and the gitignore guard,
mempalace initnow shows a one-line scope estimate computed from its existing corpus walk and prompts the user:The estimate fires BEFORE the prompt -- on a real corpus mine takes minutes-to-tens-of-minutes, so hitting Enter on a default-Y prompt with no size cue would be a footgun. Empty answer /
y/yesrunsmine()in-process;nexits with the exactmempalace mine <dir>resume hint. Errors from the in-process mine surface viasys.exit(1)instead of being swallowed.Flag design (revised after review).
--yesis unchanged from before this PR -- still scoped to entity auto-accept, still prompts for the mine step. A new--auto-mineflag is what skips the mine prompt. This avoids silently changing behaviour for scripted callers runningmempalace init --yes <dir>today.mempalace init <dir>mempalace init --yes <dir>mempalace init --auto-mine <dir>mempalace init --yes --auto-mine <dir>The scope estimate uses a single corpus walk:
scan_project's output is passed intomine()via a new optionalfiles=kwarg, so we never walk the tree twice.#1182: graceful Ctrl-C in
mineWrapped the main file-processing loop in
try / except KeyboardInterrupt. On interrupt, prints a clean summary block (files_processed: N/M,drawers_filed: K,last_file: ...) plus the resume hint, then exits with code 130 (standard SIGINT). Already-filed drawers are upserted idempotently via deterministic IDs, so re-running picks up where it left off without duplicates. A second Ctrl-C during the summary print propagates to the default handler -- we don't try to handle everything.PID-file cleanup in
finallyThe hooks-side mine PID lock at
~/.mempalace/hook_state/mine.pid(from #1023) is now actively removed by the mine subprocess on every exit path (clean exit, error, interrupt) -- but only when the file's recorded PID matchesos.getpid(), so unrelated mines from other sessions are never disturbed. Stale entries previously relied on_pid_alive()returningFalse; this makes the cleanup observable and avoids edge cases with PID reuse on short-lived test runs.Why one PR
Both changes target the same
init->mine->Ctrl-Cuser journey. The init prompt path callsmine()directly, so the new try/except inmine()is what makes the in-process auto-mine safe to interrupt without leaving the init parent in a weird state. Reviewing them together keeps the story coherent.Tests
All new paths covered:
tests/test_cli.py::test_maybe_run_mine_prompt_accepted_runs_mine-- empty answer triggersmine().tests/test_cli.py::test_maybe_run_mine_prompt_yes_accepted_runs_mine--Yanswer triggersmine().tests/test_cli.py::test_maybe_run_mine_prompt_declined_prints_hint--nanswer skips and printsmempalace mine <dir>hint.tests/test_cli.py::test_maybe_run_mine_yes_alone_still_prompts-- regression guard:--yesdoes NOT auto-mine.tests/test_cli.py::test_maybe_run_mine_auto_mine_skips_prompt----auto-mineruns mine without callinginput().tests/test_cli.py::test_maybe_run_mine_yes_and_auto_mine_fully_noninteractive-- both flags together: never callinput(), always mine.tests/test_cli.py::test_maybe_run_mine_eof_on_stdin_treated_as_decline-- piped/non-interactive stdin doesn't crash.tests/test_cli.py::test_maybe_run_mine_failure_surfaces_via_exit-- mine errors exit non-zero with an error line on stderr.tests/test_cli.py::test_maybe_run_mine_estimate_appears_before_prompt-- file-count + size line renders before the prompt fires.tests/test_miner.py::test_mine_keyboard_interrupt_prints_summary_and_exits_130-- KeyboardInterrupt mid-loop -> SystemExit(130) + summary printed.tests/test_miner.py::test_mine_cleans_up_pid_file_on_interrupt-- our PID entry removed on interrupt.tests/test_miner.py::test_mine_cleans_up_pid_file_on_clean_exit-- our PID entry removed on clean exit.tests/test_miner.py::test_mine_does_not_remove_other_processes_pid_file-- foreign PID entries left untouched.Full suite: 1247 passed locally. Ruff check + format check both clean against the CI-pinned
ruff>=0.4.0,<0.5.Test plan
mempalace init <dir>interactively, accept the prompt -> mine runs, drawers land.mempalace init --yes <dir>-> entities auto-accepted, mine prompt STILL fires (unchanged from today).mempalace init --auto-mine <dir>-> entity prompt fires, mine prompt skipped.mempalace init --yes --auto-mine <dir>-> fully non-interactive, both auto.mempalace init <dir>interactively, answern-> exits with the resume hint, no mine.mempalace mine <large-dir>, hit Ctrl-C mid-loop -> clean summary, exit code 130, no traceback.mempalace mine <same-dir>-> resumes, no duplicate drawers.~/.mempalace/hook_state/mine.pidbefore/after a mine run -> cleared after exit.Closes #1181, #1182.