perf: paginate miner.status() — fixes SQLITE_MAX_VARIABLE_NUMBER crash past ~32K drawers #1036
jphein wants to merge 1 commit into MemPalace:develop
Conversation
…h past ~32K drawers

Closes MemPalace#802 (reported 1 week ago) and MemPalace#1015 (reported today, same crash).

Before:

```python
total = col.count()
r = col.get(limit=total, include=["metadatas"])
```

On palaces with more than ~32,766 drawers, ChromaDB's underlying SQLite query builds a SELECT with `total` bound variables and exceeds `SQLITE_MAX_VARIABLE_NUMBER` (default 32,766), raising:

```
sqlite3.OperationalError: too many SQL variables
```

`mempalace status` then crashes before printing anything.

After:

```python
offset = 0
while offset < total:
    r = col.get(limit=10000, offset=offset, include=["metadatas"])
    ...
    offset += len(metas)
```

Each page stays well under the variable cap, and the loop terminates when ChromaDB returns fewer than the requested batch.

Also switches the printed counts to thousand-separated decimals (`152,682` vs `152682`) — easier to read on large palaces.

Verified on a 152,682-drawer palace that previously crashed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…emPalace#1036

Overnight/morning:
- MemPalace#681, MemPalace#1000, MemPalace#1023 merged — moved from "Still ahead" to "Merged upstream (post-v3.3.1)"
- bensig reviewed MemPalace#659 (wing_ prefix + agent filter) and MemPalace#1021 (silent_guard default) — both addressed on their PR branches
- MemPalace#673 needed re-rebase after overnight develop merges; done
- MemPalace#1036 filed: paginate miner.status(), closes upstream MemPalace#802 and MemPalace#1015

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR fixes `mempalace status` crashing on large palaces by avoiding an unbounded `col.get(limit=total, ...)` call that can exceed SQLite's `SQLITE_MAX_VARIABLE_NUMBER` via ChromaDB's query plan.
Changes:
- Paginate `miner.status()` metadata retrieval in 10k batches using `limit`/`offset` to stay under SQLite bind-variable limits.
- Update status output to use thousands-separated numeric formatting for drawer and room counts.
```python
scanned = 0
batch = 10000
offset = 0
while offset < total:
    r = col.get(limit=batch, offset=offset, include=["metadatas"])
    metas = r.get("metadatas") or []
    if not metas:
        break
    for m in metas:
        m = m or {}
        wing_rooms[m.get("wing", "?")][m.get("room", "?")] += 1
    scanned += len(metas)
    offset += len(metas)
```
The new pagination/offset loop isn't covered by tests. There are existing unit tests for `status()` in `tests/test_miner.py`, but none assert multi-page behavior (e.g., that `col.get()` is called with `offset` advancing and that results across pages are fully tallied). Please add a regression test using a fake collection with `count() > batch` and a `get(limit, offset, ...)` that returns deterministic slices, then assert the output totals match the full dataset and multiple `get()` calls occur — see the sketch below.
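Something like the following would cover it. This is a minimal sketch, not a drop-in test: the `from mempalace import miner` path and the `collection=` parameter are hypothetical stand-ins for whatever seam the existing tests in `tests/test_miner.py` already use to stub ChromaDB.

```python
from mempalace import miner  # import path assumed; match the existing tests


class FakeCollection:
    """Deterministic stand-in: count() exceeds one batch, get() returns slices."""

    def __init__(self, metas):
        self._metas = metas
        self.get_calls = []  # records (limit, offset) for every page fetched

    def count(self):
        return len(self._metas)

    def get(self, limit, offset=0, include=None):
        self.get_calls.append((limit, offset))
        return {"metadatas": self._metas[offset:offset + limit]}


def test_status_tallies_all_pages(capsys):
    # 25,000 drawers forces three pages at batch=10000 (10k + 10k + 5k).
    metas = [{"wing": "east", "room": f"r{i % 4}"} for i in range(25_000)]
    col = FakeCollection(metas)

    miner.status(collection=col)  # hypothetical seam; adapt to the real fixture

    out = capsys.readouterr().out
    assert "25,000 drawers" in out  # full dataset tallied, thousand-separated
    assert [off for _, off in col.get_calls] == [0, 10000, 20000]  # offset advanced each page
```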
| print(f"\n{'=' * 55}") | ||
| print(f" MemPalace Status — {len(metas)} drawers") | ||
| print(f" MemPalace Status — {scanned:,} drawers") |
The header prints `scanned` (rows successfully fetched) rather than `total` (from `col.count()`). If `count()` and paged `get()` ever disagree (concurrent deletes, backend truncation/bug), the command will silently under-report drawers. Consider printing `total` as the authoritative count, and if `scanned != total` emit a clear "partial" indicator (similar to how `tool_status` exposes `partial`).
| print(f" MemPalace Status — {scanned:,} drawers") | |
| print(f" MemPalace Status — {total:,} drawers") | |
| if scanned != total: | |
| print(f" partial: scanned {scanned:,} of {total:,} drawers") |
Session-local drift since the morning update — keeps the lead paragraph accurate for readers coming from MemPalace#1017 or MemPalace#1036.
I should have checked for existing PRs before filing — @eldar702 already filed #1016 yesterday with the same pagination approach and a narrower scope (pure bug fix; mine also adds thousand-separator print formatting). #1016 is the focused version and got here first. Happy to close this in favor of #1016, or rescope to just the formatting change as a follow-up — whichever the maintainers prefer.
Further correction — I had actually noted #851 in earlier triage yesterday as the canonical approved fix, and forgot that context when I filed this today. @bensig's comment on #1016 confirms: #851 is approved, MERGEABLE, and additionally addresses #850. Both #1016 and this one are narrower duplicates. Closing this in favor of #851.
… open PRs, not 8)

MemPalace#851 (vnguyen-lexipol) was approved by bensig on 2026-04-15 and already fixes the `miner.status()` `SQLITE_MAX_VARIABLE_NUMBER` crash plus MemPalace#850 silent-truncation. I filed MemPalace#1036 this morning missing that context despite having triaged MemPalace#851 the prior day. Closed MemPalace#1036 in favor of MemPalace#851, updated lead paragraph count + Still-ahead row + Open-PRs table accordingly.
Scanned all 233 open upstream PRs today against our open PRs and fork-ahead / planned-work items. Findings merged into README:
- P2 (decay) and P3 Tier-0 (LLM rerank): both covered by MemPalace#1032 (@zackchiutw, MERGEABLE, 2026-04-19 — Weibull decay + 4-stage rerank pipeline). Older simpler version at MemPalace#337. Dropped as fork work; watching MemPalace#1032.
- P7 (alternative storage): formally out of scope. RFC 001 MemPalace#743 (@igorls) defines the plugin contract; four backend PRs already in flight (MemPalace#700, MemPalace#381 Qdrant; MemPalace#574, MemPalace#575 LanceDB). Fork consumes, does not rebuild.
- P0 (multi-label tags): still fork/upstream candidate. MemPalace#1033 (@zackchiutw) ships adjacent privacy-tag + progressive disclosure but not the full multi-label scheme.
- Merged MemPalace#1023 section acknowledges complementary MemPalace#976 (felipetruman) which adds broader mine_global_lock() + HNSW num_threads pin.

Gives future-us a map so we don't re-file MemPalace#1036-style duplicates.
… (2026-04-22)

Ben's batched queue-clear pass merged four PRs at 00:38 UTC: graph cache (MemPalace#661), deterministic hook saves (MemPalace#673), Claude Code 2.1.114 hook stdout + silent_save guard (MemPalace#1021), and upstream's own MemPalace#851 pagination fix (closing MemPalace#1036 as superseded). Moved four rows out of the "Fork Changes" / "Fork change queue" tables into their respective merged-upstream history sections. Intro sentence PR count reduced from 7 → 4 open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #802 (reported 1 week ago) and #1015 (filed today — same crash, same stack location, filed after you merged #999).
The bug
On palaces with more than ~32,766 drawers, ChromaDB's underlying SQLite query binds `total` variables and exceeds `SQLITE_MAX_VARIABLE_NUMBER` (default 32,766), raising `sqlite3.OperationalError: too many SQL variables`. `mempalace status` crashes before printing anything on any palace over the limit. The fork has been running this paginated version for 9 days (landed on jphein/mempalace main on 2026-04-10) against a 152,682-drawer palace, so I haven't re-triggered the crash directly in that window — but the unpatched code path is reproducible by reverting this patch.
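For reviewers who want to see the failure class without a 32K-drawer palace, here is a minimal standalone repro of the SQLite cap itself: plain `sqlite3`, no ChromaDB. It assumes a stock SQLite build with the default `SQLITE_MAX_VARIABLE_NUMBER` of 32,766; some distro builds raise the cap, in which case bump the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drawers (id INTEGER PRIMARY KEY)")

ids = list(range(40_000))  # any count above 32,766 trips the default cap
placeholders = ",".join("?" * len(ids))

# Raises sqlite3.OperationalError: too many SQL variables
conn.execute(f"SELECT id FROM drawers WHERE id IN ({placeholders})", ids)
```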
The fix

Each page stays well under the variable cap. The loop terminates when ChromaDB returns fewer than the requested batch (final page).
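For reference, the shape of the new loop, copied from the commit message (the `...` elides the per-page tally shown in the diff above):

```python
offset = 0
while offset < total:
    r = col.get(limit=10000, offset=offset, include=["metadatas"])
    ...
    offset += len(metas)
```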
Also switches printed counts to thousand-separated decimals (`152,682` vs `152682`) — easier to read on large palaces.

Test plan
- `pytest tests/test_miner.py tests/test_cli.py -q` — 62 passed
- `status` completes and prints correct per-wing/per-room counts

Notes
- `miner.py::status()` only. Same underlying issue may exist in other `col.get(limit=total)` call sites; happy to file follow-ups if you confirm you want this scope.
- `mempalace status` crashes on palaces >~32K drawers (SQLite variable limit) #1015 duplicates `mempalace status` crashes with "too many SQL variables" on large palaces #802, and the fix is simple enough to merge independently.