Commit 7538299
fix: prevent HNSW index bloat via batch_size + sync_threshold metadata
Sets `hnsw:batch_size` and `hnsw:sync_threshold` to 50_000 on collection
creation in both `get_collection(..., create=True)` and the legacy
`create_collection()` path. Preserves existing `hnsw:space` and
`hnsw:num_threads=1` (race fix from MemPalace#976) and the `**ef_kwargs` plumbing
for embedding-function injection (perf fix from MemPalace#1148/a4868a3).
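
As a rough illustration of the metadata described above, the sketch below builds the dictionary that would be passed at collection creation. The helper name `build_hnsw_metadata` and the `"cosine"` default for `hnsw:space` are assumptions made for this example, not taken from the patch; only the four keys and the 50_000 / 1 values come from the commit message.

```python
from typing import Any, Dict

# Values stated in the commit message.
HNSW_BATCH_SIZE = 50_000
HNSW_SYNC_THRESHOLD = 50_000


def build_hnsw_metadata(space: str = "cosine") -> Dict[str, Any]:
    """Hypothetical helper: assemble HNSW metadata for a new collection.

    `space` defaults to "cosine" only for illustration; the commit says the
    existing `hnsw:space` value is preserved but does not state what it is.
    """
    return {
        "hnsw:space": space,                         # preserved existing key
        "hnsw:num_threads": 1,                       # race fix noted in the commit
        "hnsw:batch_size": HNSW_BATCH_SIZE,          # new default
        "hnsw:sync_threshold": HNSW_SYNC_THRESHOLD,  # new default
    }


if __name__ == "__main__":
    print(build_hnsw_metadata())
```

With chromadb installed, the result would typically be passed as `client.get_or_create_collection(name, metadata=build_hnsw_metadata(), **ef_kwargs)`; how the patch itself wires this into `get_collection(..., create=True)` is not shown here.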
Without these defaults, mining roughly 10K or more drawers triggers ~30 HNSW index
resizes and hundreds of persistDirty() calls. persistDirty uses relative
seek positioning in link_lists.bin; accumulated seek drift across resize
cycles causes the OS to extend the sparse file with zero-filled regions,
each cycle compounding the next. Result: link_lists.bin grows to
hundreds of GB (sparse), after which `status`, `search`, and `repair` all
segfault and the palace is unrecoverable.
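
To see why raising the threshold matters, here is a deliberately simplified model that assumes one index sync per `sync_threshold` added records. The function name is hypothetical, the 1_000 figure is an assumed pre-fix threshold, and the real number of persistDirty() calls also depends on batching and resize behaviour, so treat the numbers as orders of magnitude only.

```python
def estimated_syncs(total_records: int, sync_threshold: int) -> int:
    """Hypothetical model: count how many times the sync threshold is crossed."""
    if sync_threshold <= 0:
        raise ValueError("sync_threshold must be positive")
    return total_records // sync_threshold


if __name__ == "__main__":
    drawers = 39_792  # workload size quoted in the commit message
    # 1_000 is an assumed pre-fix threshold, used only for comparison.
    print(estimated_syncs(drawers, 1_000))   # 39
    print(estimated_syncs(drawers, 50_000))  # 0
```

Under this model, a workload smaller than the threshold never triggers an intermediate sync, so the drift described above has no resize cycles over which to accumulate.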
Empirical: rebuilt a palace from scratch on 39,792 drawers across 5
wings with this fix applied. Final palace 376 MB, link_lists.bin stays
at 0 bytes across both Chroma collection dirs, status and search both
return cleanly. Same workload without the fix bloated the palace to
565 GB sparse (30 GB on disk) and segfaulted at ~15K drawers.
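
The "sparse" versus "on disk" distinction can be checked with a short script along these lines. This is a sketch for POSIX systems (it relies on `st_blocks`, which is unavailable on Windows), and the function name is an assumption for illustration.

```python
import os
from typing import Tuple


def file_sizes(path: str) -> Tuple[int, int]:
    """Return (apparent_bytes, allocated_bytes) for a file.

    The apparent size includes holes in a sparse file; the allocated size
    counts only blocks actually backed by storage.
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512  # st_blocks is reported in 512-byte units
    return apparent, allocated


if __name__ == "__main__":
    import sys

    apparent, allocated = file_sizes(sys.argv[1])
    print(f"apparent={apparent} bytes, allocated={allocated} bytes")
```

Running it against `link_lists.bin` in a collection directory would show a large gap between the two numbers on a bloated palace, and 0 for both on a healthy one as reported above.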
Migration note: chromadb treats HNSW config as immutable post-creation,
so existing bloated palaces still need to be nuked and re-mined; this
only protects fresh collections.
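
Because the config cannot be changed after creation, it may help to check whether an existing collection already carries the new defaults before deciding to rebuild. The sketch below works on a plain metadata mapping (for example, the `metadata` attribute of a chromadb collection); the function name is hypothetical and not part of this patch.

```python
from typing import Any, List, Mapping, Optional

# Keys and values introduced by this commit.
REQUIRED_HNSW_DEFAULTS = {
    "hnsw:batch_size": 50_000,
    "hnsw:sync_threshold": 50_000,
}


def missing_hnsw_defaults(metadata: Optional[Mapping[str, Any]]) -> List[str]:
    """Hypothetical check: list required keys that are absent or differ."""
    metadata = metadata or {}
    return [
        key
        for key, expected in REQUIRED_HNSW_DEFAULTS.items()
        if metadata.get(key) != expected
    ]


if __name__ == "__main__":
    old_style = {"hnsw:space": "cosine", "hnsw:num_threads": 1}
    print(missing_hnsw_defaults(old_style))
    # ['hnsw:batch_size', 'hnsw:sync_threshold']
```

A non-empty result indicates a collection created before this fix, which per the note above would need to be re-mined rather than updated in place.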
Tests assert both keys land on the persisted collection metadata in
both code paths, which also covers the MemPalace#1161 "config silently dropped"
concern at CI time.
Closes MemPalace#344
Supersedes MemPalace#346
Co-authored-by: robot-rocket-science <robot-rocket-science@users.noreply.github.com>
1 parent 0d9929c commit 7538299
2 files changed
Lines changed: 69 additions & 2 deletions
Hunk positions (diff line contents and file names were not captured):

| File | Original lines | New lines | Change |
|---|---|---|---|
| 1 | | 30-52 | 23 lines added |
| 1 | 599 | 622-626 | 1 line replaced by 5 |
| 1 | 649 | 676-680 | 1 line replaced by 5 |
| 2 | | 338-373 | 36 lines added |