
Commit 7a3ee80

jphein and claude committed
merge: upstream/develop (2026-04-25) — v3.3.4 features + MemPalace#976 HNSW fixes
Bring in 29 commits from upstream/develop since the last merge (2026-04-23).

Major absorbed changes:
- MemPalace#976 (Felipe Truman): HNSW graph corruption fix, mine_global_lock for fan-out, MAX_PRECOMPACT_BLOCK_ATTEMPTS=2 for /compact deadlocks. Closes MemPalace#974/MemPalace#965/MemPalace#955; likely resolves MemPalace#1172 too.
- MemPalace#1179 (Igor): CLI mempalace search routes through _hybrid_rank, legacy-metric warning + _warn_if_legacy_metric, invariant tests on hnsw:space=cosine across all 5 collection-creation paths.
- MemPalace#1180/MemPalace#1183/MemPalace#1184: cross-wing topic tunnels, init mine UX, --auto-mine.
- MemPalace#1185 (perf/batched-upsert-gpu): batched ChromaDB upserts, GPU device detection via mempalace/embedding.py.
- MemPalace#1182: graceful Ctrl-C during mempalace mine.
- MemPalace#1168: tunnel permissions security fix.

Conflict resolutions (8 files):
- searcher.py: kept fork's CLI delegation through search_memories (cleaner single-source-of-truth path); upstream's inline-retrieval branch dropped. Merged upstream's print formatting (cosine= + bm25=) with fork's matched_via reporting from sqlite BM25 fallback.
- backends/chroma.py: kept fork's _BLOB_FIX_MARKER + palace_path arg to ChromaCollection (MemPalace#1171 write lock); merged upstream's **ef_kwargs (embedding_function support from MemPalace#1185). Removed duplicate _pin_hnsw_threads (fork already cherry-picked Felipe's earlier).
- mcp_server.py: kept fork's palace_path arg + upstream's clearer comment on hnsw:num_threads=1 rationale.
- miner.py: took upstream's serial mine() flow (mine_global_lock + Ctrl-C), RESTORED fork's strict detect_room, because substring matching from upstream breaks fork-only test_detect_room_priority1_no_substring_match. Added `import re` for word-boundary regex matching. Fork-ahead concurrent mining (workers=, ThreadPoolExecutor from 5cd14bd) is dropped; daemon migration deprioritizes local concurrent mining; can re-port if needed.
- CHANGELOG.md: combined fork's segfault-trio narrative with upstream's v3.3.4 release notes.
- HOOKS_TUTORIAL.md: took MEMPALACE_PYTHON env var name (fork README was stale; hooks already use this name per fork-ahead MemPalace#19).
- test_convo_miner_unit.py: took both contextlib + pytest imports.
- test_searcher.py: imported upstream's 3 new TestSearchCLI tests but skipped them with TODOs, since they assume upstream's inline-retrieval CLI with simpler mocks. Rewrite needed for fork's delegated search_memories path (sqlite BM25 fallback + scope counting).

Test result: 1334 passed, 3 skipped (the upstream tests above), 3 warnings.

Implications for open fork PRs:
- MemPalace#1171 (cross-process write lock at adapter) becomes structurally redundant given MemPalace#976's mine_global_lock + daemon-strict serialization. Slated for close with thank-you to Felipe.
- MemPalace#1173/MemPalace#1177 still relevant (quarantine threshold + blob_seq marker).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
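The miner.py resolution in the commit message hinges on the difference between substring matching and word-boundary regex matching. The following is a minimal sketch of that difference only; it is not the fork's `detect_room`. The function names, the `room_keywords` mapping, and the `"general"` fallback are hypothetical and chosen for illustration.

```python
import re


def matches_keyword(text, keyword):
    """Return True only if `keyword` appears as a whole word in `text`.

    Hypothetical helper. Assumes `keyword` starts and ends with a word
    character, so that the \\b anchors behave as expected.
    """
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def detect_room_sketch(text, room_keywords):
    """Return the first room whose keyword matches on a word boundary.

    `room_keywords` maps a room name to a list of keywords. Falls back to
    the (assumed) room name "general" when nothing matches.
    """
    for room, keywords in room_keywords.items():
        if any(matches_keyword(text, kw) for kw in keywords):
            return room
    return "general"
```

With a plain `keyword in text` check, the keyword `api` would match inside `capital`; the word-boundary version rejects that case while still matching `API` as a standalone word.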
2 parents e22484e + 0d9929c commit 7a3ee80

36 files changed

Lines changed: 3024 additions & 283 deletions

.github/workflows/ci.yml

Lines changed: 6 additions & 2 deletions
@@ -17,6 +17,7 @@ jobs:
       - uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
+          cache: 'pip'
       - run: pip install -e ".[dev]"
       - run: python -m pytest tests/ -v --ignore=tests/benchmarks --cov=mempalace --cov-report=term-missing --cov-fail-under=80 --durations=10

@@ -26,7 +27,8 @@ jobs:
       - uses: actions/checkout@v6
       - uses: actions/setup-python@v6
         with:
-          python-version: "3.9"
+          python-version: "3.11"
+          cache: 'pip'
       - run: pip install -e ".[dev]"
       - run: python -m pytest tests/ -v --ignore=tests/benchmarks --cov=mempalace --cov-report=term-missing --cov-fail-under=80 --durations=10

@@ -36,7 +38,8 @@ jobs:
       - uses: actions/checkout@v6
       - uses: actions/setup-python@v6
         with:
-          python-version: "3.9"
+          python-version: "3.11"
+          cache: 'pip'
       - run: pip install -e ".[dev]"
       - run: python -m pytest tests/ -v --ignore=tests/benchmarks --cov=mempalace --cov-report=term-missing --cov-fail-under=80 --durations=10
   lint:
@@ -46,6 +49,7 @@ jobs:
       - uses: actions/setup-python@v6
         with:
           python-version: "3.11"
+          cache: 'pip'
       - run: pip install "ruff>=0.4.0,<0.5"
       - run: ruff check .
       - run: ruff format --check .

.github/workflows/deploy-docs.yml

Lines changed: 3 additions & 3 deletions
@@ -23,7 +23,7 @@ jobs:
     permissions:
       contents: read
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
         with:
           fetch-depth: 0

@@ -46,7 +46,7 @@ jobs:
           DOCS_EDIT_BRANCH: ${{ github.ref_name }}
         run: bun run docs:build

-      - uses: actions/upload-pages-artifact@v3
+      - uses: actions/upload-pages-artifact@v5
         with:
           path: website/.vitepress/dist

@@ -63,4 +63,4 @@ jobs:
     steps:
       - name: Deploy to GitHub Pages
         id: deployment
-        uses: actions/deploy-pages@v4
+        uses: actions/deploy-pages@v5

.github/workflows/version-guard.yml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ jobs:
   check-versions:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6

       - name: Extract versions from all sources
         id: versions

CHANGELOG.md

Lines changed: 22 additions & 5 deletions
@@ -8,13 +8,30 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),

 ## [3.3.4] — 2026-04-24

-### Bug Fixes — ChromaDB 1.5.x segfault trio
+### Added (from upstream develop, merged 2026-04-25)

-Three independent triggers were crashing fresh processes (MCP servers, stop hooks, `mempalace mine` subprocesses) with SIGSEGV in `chromadb_rust_bindings`. Fixed and filed as [#1171](https://github.com/milla-jovovich/mempalace/pull/1171), [#1173](https://github.com/milla-jovovich/mempalace/pull/1173), [#1177](https://github.com/milla-jovovich/mempalace/pull/1177).
+- **`mempalace init` now prompts to mine the same directory.** After entity confirmation, room detection, and gitignore guard, `init` shows a one-line scope estimate (e.g. `~423 files (~12 MB) would be mined into this palace.`) computed from its existing corpus walk, then asks `Mine this directory now? [Y/n]` (default yes) and runs `mine()` in-process if accepted. The estimate fires before the prompt so users on a real corpus aren't surprised by a minutes-long ChromaDB write. Declining prints the exact `mempalace mine <dir>` command for later. (#1181)
+- **New `--auto-mine` flag on `mempalace init`** for the non-interactive path (`mempalace init --auto-mine <dir>` skips the mine prompt and runs mine directly). `--yes` retains its existing scope of entity auto-accept only and still prompts for the mine step. (#1181)
+- **Cross-wing topic tunnels.** When two wings have confirmed `TOPIC` labels in common, the miner drops a symmetric tunnel between them at mine time. Topic tunnels are stored under a synthetic `topic:<name>` room and tagged with `kind: "topic"`. Threshold is configurable via `MEMPALACE_TOPIC_TUNNEL_MIN_COUNT` env var or `topic_tunnel_min_count` in `~/.mempalace/config.json` (default `1`). (#1180)
+- **HNSW graph corruption + PreCompact deadlock + mine fan-out fixes** (#976, Felipe Truman): pins `hnsw:num_threads=1` on collection creation (matches our fork's earlier cherry-pick `552d0d5`), adds `mine_global_lock()` to collapse concurrent `mempalace mine` runs, and caps `MAX_PRECOMPACT_BLOCK_ATTEMPTS=2` so `/compact` can proceed after repeated blocks. Closes #974, #965, #955. Likely also resolves #1172 (PreCompact unconditionally blocking compact).

-- **Backend-seam write lock.** `_palace_write_lock(palace_path)` wraps `ChromaCollection.add/upsert/update/delete` using `fcntl.flock(LOCK_EX)` on `$palace/.write.lock`. Because RFC 001 made the adapter the single boundary for all ChromaDB writes, putting the lock there covers every caller (`mcp_server`, `miner`, `convo_miner`, `palace`) automatically. First version of this fix lived in `mcp_server.py` but missed the `mempalace mine` subprocess; moved to the adapter. `flock` auto-releases on process death so crashes can't deadlock.
-- **Quarantine on every `make_client()`.** Upstream's `quarantine_stale_hnsw()` only ran at MCP server startup (via #1062). Fork now calls it inside `ChromaBackend.make_client()` itself, so every fresh process (hook, CLI, tests) opens a clean palace. Default threshold lowered 3600 → 300s after a 0.96h-drift segfault in production.
-- **`.blob_seq_ids_migrated` marker guard.** Opening `chroma.sqlite3` via Python's `sqlite3.connect()` against a live ChromaDB 1.5.x WAL database leaves state that segfaults the next `PersistentClient`. `_fix_blob_seq_ids()` now writes a sentinel file after first successful migration; subsequent opens short-circuit before touching sqlite. Closes #1090. (Restoration of a guard lost in an earlier upstream merge.)
+### Bug Fixes (from upstream develop)
+
+- **CLI `mempalace search` retrieval quality.** Wired the CLI through the same `_hybrid_rank` the `mempalace_search` MCP tool used, and surfaced both `cosine=` and `bm25=` scores in the output. MCP search was unaffected; this fixes the human-facing CLI parity gap.
+- **Legacy-palace distance-metric warning.** CLI search now detects palaces created before `hnsw:space=cosine` was consistently set and prints a one-line notice pointing at `mempalace repair`. (#1179)
+- **Graceful Ctrl-C during `mempalace mine`.** Interrupting a long mine no longer dumps a multi-frame traceback. (#1182)
+
+### Bug Fixes — fork-ahead ChromaDB 1.5.x segfault work
+
+Three independent SIGSEGV triggers in `chromadb_rust_bindings` filed as [#1171](https://github.com/MemPalace/mempalace/pull/1171), [#1173](https://github.com/MemPalace/mempalace/pull/1173), [#1177](https://github.com/MemPalace/mempalace/pull/1177). After 2026-04-25 upstream merge: #1171 is structurally redundant given Felipe's `mine_global_lock` (#976) plus the fork's daemon serialization, slated for close.
+
+- **Backend-seam write lock.** `_palace_write_lock(palace_path)` wraps `ChromaCollection.add/upsert/update/delete` using `fcntl.flock(LOCK_EX)` on `$palace/.write.lock`. (Slated for close after #976 / daemon-strict obsoletes it.)
+- **Quarantine on every `make_client()`.** Fork calls `quarantine_stale_hnsw()` inside `ChromaBackend.make_client()` itself, default threshold 300s. (#1173)
+- **`.blob_seq_ids_migrated` marker guard.** `_fix_blob_seq_ids()` now writes a sentinel file after first successful migration; subsequent opens short-circuit before touching sqlite. Closes #1090. (#1177)
+
+### Bug Fixes — fork-ahead BM25 search
+
+- **`_tokenize` None-document guard** (commit `a3a7132`, PR [#1198](https://github.com/MemPalace/mempalace/pull/1198)). `searcher._tokenize` short-circuits to `[]` when document text is `None`, preventing `AttributeError` during `_hybrid_rank → _bm25_scores → _tokenize`. Observed in production daemon log on 2026-04-24. Closes the gap left by upstream's #999 None-metadata audit, which covered metadata read loops but not BM25 helpers.

 ---

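The backend-seam write lock entry in the CHANGELOG.md diff describes an exclusive `fcntl.flock(LOCK_EX)` held on `$palace/.write.lock` around collection writes. A minimal sketch of that pattern follows, assuming a Unix platform (the `fcntl` module is not available on Windows). It is not the fork's `_palace_write_lock`; the names `palace_write_lock_sketch` and `is_write_locked` are hypothetical and exist only for illustration.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def palace_write_lock_sketch(palace_path):
    """Hold an exclusive advisory lock on <palace_path>/.write.lock.

    Hypothetical stand-in for the lock described in the changelog.
    """
    os.makedirs(palace_path, exist_ok=True)
    lock_path = os.path.join(palace_path, ".write.lock")
    # The lock belongs to this open file description. If the process dies,
    # the kernel closes the descriptor and the lock is released with it.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)  # blocks until free
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def is_write_locked(palace_path):
    """Probe the lock without blocking; True if someone else holds it."""
    lock_path = os.path.join(palace_path, ".write.lock")
    with open(lock_path, "a") as probe:
        try:
            fcntl.flock(probe.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(probe.fileno(), fcntl.LOCK_UN)
        return False
```

A writer would wrap each mutating call in `with palace_write_lock_sketch(path): ...`. Because the lock is advisory, it only serializes callers that also take it, which is why the changelog places it at the single adapter boundary.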