refactor(searcher): hoist CLOSET_RANK_BOOSTS to module level + record ablation finding #1378
Conversation
A/B ablation 2026-04-27 against the 151K canonical palace (12-probe set mixing recent fork-side decisions with mined-file content). Closet boost fires on ~20% of result rows, concentrated in queries with answers in mined files; sparse on chat-transcript queries. When the boost fired, it re-ordered chunks within a single source file rather than displacing right answers with wrong ones. VecRecall's critique (MemPalace#1129 — "organization-layer involvement in retrieval reduces R@5") did not reproduce on this corpus. The hybrid degrades to effectively pure-vector for transcript queries and re-ranks within-file chunks for mined-file queries, neither of which matches the failure mode VecRecall is fixing. Captured as a comment next to the constants so future-us doesn't have to re-run the experiment to learn the boost is mostly inert on chat-heavy corpora. Refs: scratch experiment was /tmp/closet-boost-ab.py (not committed, re-creatable from this comment + the hoist in f558d3c).
Pull request overview
This PR refactors the closet-boost tuning knobs in search_memories() by hoisting CLOSET_RANK_BOOSTS and CLOSET_DISTANCE_CAP to module scope, and adds an in-code comment capturing an empirical ablation result relevant to ongoing retrieval-quality discussion (#1129). The intent is to keep runtime behavior unchanged while making the boost parameters patchable for benchmarking/tuning.
Changes:
- Hoist `CLOSET_RANK_BOOSTS` and `CLOSET_DISTANCE_CAP` from inside `search_memories()` to module-level constants.
- Add a module-level comment block documenting an A/B ablation on a large corpus and summarizing observed impact.
```python
CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]
CLOSET_DISTANCE_CAP = 1.5  # cosine dist > 1.5 = too weak to use as signal
```
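For readers outside the codebase, a hypothetical sketch of how a rank-based boost with a distance cap like this could be applied to scored results. Only the two constants are from the PR; `apply_closet_boost` and the result-tuple shape are illustrative assumptions, not the actual `search_memories()` internals.

```python
# Sketch only: constants from the PR, everything else assumed.
CLOSET_RANK_BOOSTS = (0.40, 0.25, 0.15, 0.08, 0.04)
CLOSET_DISTANCE_CAP = 1.5  # cosine dist > 1.5 = too weak to use as signal

def apply_closet_boost(results):
    """results: list of (score, cosine_dist, in_closet) tuples,
    sorted best-first. Returns the re-scored list, re-sorted."""
    boosted = []
    boost_idx = 0
    for score, dist, in_closet in results:
        bonus = 0.0
        # Only the first len(CLOSET_RANK_BOOSTS) closet hits get a bonus,
        # and only when the vector distance is strong enough to trust.
        if (in_closet and dist <= CLOSET_DISTANCE_CAP
                and boost_idx < len(CLOSET_RANK_BOOSTS)):
            bonus = CLOSET_RANK_BOOSTS[boost_idx]
            boost_idx += 1
        boosted.append((score + bonus, dist, in_closet))
    return sorted(boosted, key=lambda r: r[0], reverse=True)
```

Note how the cap makes the boost self-limiting: a closet hit with a weak vector match (distance above 1.5) is passed through unchanged, which is consistent with the ablation's observation that the boost is mostly inert on chat-heavy corpora.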
Good catch — fixed in 1474699. CLOSET_RANK_BOOSTS is now a tuple, so a stray append/__setitem__ on the imported attribute can't silently reshape global ranking. The ablation workflow that motivated the hoist still works because it rebinds the module attribute (searcher.CLOSET_RANK_BOOSTS = (0,) * 5) rather than mutating in place — see scripts/closet_boost_ablation.py.
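The rebind-versus-mutate distinction described above can be demonstrated in isolation. A stand-in module is used here since this snippet doesn't import the real `searcher`:

```python
# Stand-in for mempalace/searcher.py, built just for this demo.
import types

searcher = types.ModuleType("searcher")
searcher.CLOSET_RANK_BOOSTS = (0.40, 0.25, 0.15, 0.08, 0.04)

# In-place mutation now fails loudly: tuples have no append/__setitem__,
# so a stray edit can't silently reshape global ranking.
mutation_blocked = False
try:
    searcher.CLOSET_RANK_BOOSTS.append(0.99)
except AttributeError:
    mutation_blocked = True

# The ablation workflow still works: rebind the whole module attribute.
searcher.CLOSET_RANK_BOOSTS = (0,) * 5
```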
```python
# Empirical note (A/B ablation 2026-04-27 on the 151K canonical palace,
# 12-probe set covering recent fork-side work + transcript content):
# boost fires on ~20% of result rows, concentrated in queries whose
# answer lives in mined files; closets are sparse on chat-transcript
# queries (most fork-side decisions). When the boost did fire, it
# re-ordered chunks within a single source file rather than displacing
# right answers with wrong ones — i.e., VecRecall's critique
# (https://github.com/MemPalace/mempalace/discussions/1129, "org-layer
# in retrieval path drops R@5") didn't reproduce here. Kept as a
# rare-but-cheap signal; ablation script lived in /tmp, not committed.
```
Fair — fixed in 1474699 by committing scripts/closet_boost_ablation.py. Takes a palace path + optional probe set on stdin, runs the search under default boosts and zero boosts, prints a per-query delta showing whether the boost re-ordered candidates or just nudged. Lazy-imports the searcher module so --help is usable without chromadb. Comment block now points at the committed script.
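A script in the shape described above might look like the following skeleton. This is a sketch under stated assumptions, not the committed `scripts/closet_boost_ablation.py`: the `search_memories(palace_path, query)` call signature and result shape are guesses, while the lazy import and the no-op/re-ordered/displaced classification follow the description in this thread.

```python
import argparse
import sys

def reorder_delta(default_ids, zeroed_ids):
    """Per-query summary: did the boost re-order the top-k, displace
    candidates, or do nothing?"""
    if default_ids == zeroed_ids:
        return "no-op"
    if sorted(default_ids) == sorted(zeroed_ids):
        return "re-ordered"   # same candidates, different order
    return "displaced"        # boost changed which candidates appear

def main(argv=None):
    parser = argparse.ArgumentParser(
        description="A/B the closet boost: default vs zeroed, probes on stdin.")
    parser.add_argument("palace_path", help="path to the palace store")
    args = parser.parse_args(argv)

    # Lazy import: pulling in the searcher (and chromadb) only after
    # argument parsing keeps --help usable without chromadb installed.
    import searcher

    probes = [line.strip() for line in sys.stdin if line.strip()]
    for query in probes:
        default_hits = searcher.search_memories(args.palace_path, query)
        searcher.CLOSET_RANK_BOOSTS = (0,) * 5   # rebind, never mutate
        zeroed_hits = searcher.search_memories(args.palace_path, query)
        print(query, reorder_delta(default_hits, zeroed_hits))

if __name__ == "__main__":
    main()
```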
…on reproducer

Two corrections from @copilot-pull-request-reviewer on MemPalace#1378:

1. CLOSET_RANK_BOOSTS was tagged "constant" but typed as a mutable list — a downstream import that did ``searcher.CLOSET_RANK_BOOSTS.append(...)`` would have silently reshaped global ranking. Switched to a tuple. The ablation workflow that motivated the hoist still works: callers patch at the module-attribute level (``searcher.CLOSET_RANK_BOOSTS = (...)``) rather than mutating in place.

2. The empirical comment claimed "ablation script lived in /tmp, not committed" — which made the finding unreproducible by anyone else. Committed scripts/closet_boost_ablation.py: takes a palace path + optional probe set on stdin, runs the search under default boosts and zero boosts, reports per-query whether the boost re-ordered or just nudged. Lazy-imports the searcher module so --help works without chromadb installed. Comment updated to point at the committed script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI lint reformat-check caught the new script. No semantic change.
@igorls — friendly nudge: this one is small (+21 / -6 in `mempalace/searcher.py`).
Summary
Hoist `CLOSET_RANK_BOOSTS` and `CLOSET_DISTANCE_CAP` from inside `search_memories` to module scope so they're patchable for A/B benchmarking without touching the function. Pure refactor + comment — no behavior change.
Why
The rank-based closet-boost has been the subject of #1129 (VecRecall: "organization-layer involvement in retrieval reduces R@5"). Before deciding whether to act on that critique, I ran a 12-probe A/B against my 151K-drawer canonical palace, default-vs-zeroed boosts. Findings (now captured in code):

- The boost fires on ~20% of result rows, concentrated in queries whose answer lives in mined files; closets are sparse on chat-transcript queries.
- When it did fire, it re-ordered chunks within a single source file rather than displacing right answers with wrong ones — VecRecall's critique didn't reproduce on this corpus.
The hoist itself is benign and useful — it lets future tuners swap the constants from outside the function (env var, config flag, in-process patch). The ablation comment lives next to the constants so future-us doesn't have to re-run the experiment.
Scope
mempalace/searcher.py: +21 / -6

Provenance
This was filed in response to @igorls's invitation in his close of `#1286` to file the closet-boost refactor as a small standalone PR. Two commits, surgical, against current `develop`.

🤖 Generated with Claude Code