We've verified the following 3 patches work in production on Windows 11 / Python 3.12 / MemPalace 3.3.2 / ChromaDB 1.5.8. Sharing here so maintainers can consider merging, and others can self-apply while waiting for official fixes.
Fix 1: HNSW explosion prevention (related to #1091)
File: mempalace/backends/chroma.py
The root cause is ChromaDB's compactor rebuilding HNSW graphs without bound on large collections. We add two pre-write safety guards.
A. Add this import directly after the existing import chromadb line:
import chromadb
import shutil as _mp_shutil
B. Add helper functions before def quarantine_stale_hnsw (the second helper calls os.scandir, so os must be imported in this module):
def _check_disk_space(palace_path: str, required_mb: int = 500):
    """Abort writes when free disk space is below threshold. Prevents OOM/disk-fill scenarios."""
    try:
        _, _, free = _mp_shutil.disk_usage(palace_path)
    except OSError:
        return
    free_mb = free // (1024 * 1024)
    if free_mb < required_mb:
        raise RuntimeError(
            f"Disk space critically low: {palace_path} has {free_mb}MB free "
            f"(threshold: {required_mb}MB). Write aborted to prevent HNSW explosion."
        )

def _check_segment_size(palace_path: str, max_mb_per_segment: int = 500):
    """Detect HNSW segment bloat (link_lists.bin > threshold). Issue #1091 describes
    a case where a single segment's link_lists.bin grew to 582GB. This catches it
    before the next write rather than after disk exhaustion."""
    try:
        for entry in os.scandir(palace_path):
            if not entry.is_dir():
                continue
            for f in os.scandir(entry.path):
                if f.name == "link_lists.bin":
                    size_mb = f.stat().st_size // (1024 * 1024)
                    if size_mb > max_mb_per_segment:
                        raise RuntimeError(
                            f"HNSW segment bloat detected: {f.path} = {size_mb}MB "
                            f"(threshold: {max_mb_per_segment}MB). Run: mempalace repair "
                            "or manually delete the oversized segment directory."
                        )
    except OSError:
        pass
C. In ChromaCollection.add() and ChromaCollection.upsert(), add at the top:
settings = self._collection._client.get_settings()
palace_path = settings.persist_directory
_check_disk_space(palace_path, required_mb=500)
_check_segment_size(palace_path, max_mb_per_segment=500)
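To try the segment scan without touching a real palace, here is a minimal sketch that builds a throwaway directory tree and reports oversized link_lists.bin files. The helper names (find_oversized_segments, make_fake_palace), the directory names, and the byte sizes are hypothetical and chosen only for illustration. Unlike the patch above, it returns the offending paths instead of raising.

```python
import os
import tempfile


def find_oversized_segments(palace_path: str, max_bytes: int) -> list[str]:
    """Return paths of link_lists.bin files larger than max_bytes.

    Hypothetical helper: same one-level directory walk as the patch,
    but it reports the offending paths instead of raising.
    """
    oversized = []
    with os.scandir(palace_path) as entries:
        for entry in entries:
            if not entry.is_dir():
                continue
            with os.scandir(entry.path) as files:
                for f in files:
                    if f.name == "link_lists.bin" and f.stat().st_size > max_bytes:
                        oversized.append(f.path)
    return sorted(oversized)


def make_fake_palace(root: str) -> None:
    # Two fake segment directories; the sizes are arbitrary test values.
    for name, size in (("seg-small", 10), ("seg-big", 4096)):
        seg = os.path.join(root, name)
        os.makedirs(seg)
        with open(os.path.join(seg, "link_lists.bin"), "wb") as fh:
            fh.write(b"\0" * size)


with tempfile.TemporaryDirectory() as root:
    make_fake_palace(root)
    hits = find_oversized_segments(root, max_bytes=1024)
    # Keep only the segment directory names so the result is path-independent.
    hit_names = [os.path.basename(os.path.dirname(p)) for p in hits]
    print(hit_names)
```

Using os.scandir as a context manager closes the directory handles promptly, which matters on Windows when the directory is removed right afterwards.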
Fix 2: mempalace status crash on large palaces (fixes #1098)
File: mempalace/miner.py — status() function
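The col.get(limit=total) call fetches every drawer in a single query. SQLite caps the number of bind variables per statement (SQLITE_MAX_VARIABLE_NUMBER defaults to 999 before SQLite 3.32.0 and to 32766 from 3.32.0 onward, and can be changed at compile time). With more than about 30K drawers the query fails with "too many SQL variables".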
Replace:
total = col.count()
r = col.get(limit=total, include=["metadatas"]) if total else {"metadatas": []}
metas = r["metadatas"]
With:
total = col.count()
metas = []
if total:
    BATCH = 5000
    for offset in range(0, total, BATCH):
        r = col.get(limit=BATCH, offset=offset, include=["metadatas"])
        metas.extend(r["metadatas"] or [])
Note: #66 added batching elsewhere in the codebase but this location was missed and regressed.
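The batching pattern can be checked in isolation with a stand-in collection. In this sketch, FakeCollection and fetch_all_metadatas are hypothetical names. The fake implements only count() and get(limit, offset, include) and records the largest limit it was asked for, so it says nothing about ChromaDB's real behavior beyond the call shape used in the patch.

```python
class FakeCollection:
    """Stand-in for a Chroma collection (assumption: only count() and get() matter here)."""

    def __init__(self, n):
        self._metas = [{"wing": f"w{i % 3}"} for i in range(n)]
        self.largest_limit = 0  # largest single-request size seen so far

    def count(self):
        return len(self._metas)

    def get(self, limit, offset=0, include=None):
        self.largest_limit = max(self.largest_limit, limit)
        return {"metadatas": self._metas[offset:offset + limit]}


def fetch_all_metadatas(col, batch_size=5000):
    """Page through the collection so no single request exceeds batch_size."""
    total = col.count()
    metas = []
    for offset in range(0, total, batch_size):
        r = col.get(limit=batch_size, offset=offset, include=["metadatas"])
        metas.extend(r["metadatas"] or [])
    return metas


col = FakeCollection(12_345)
metas = fetch_all_metadatas(col, batch_size=5000)
print(len(metas), col.largest_limit)
```

The final batch is simply shorter than batch_size, so no special-casing is needed for totals that are not a multiple of the batch size.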
Fix 3: silent_save config flag ignored (fixes #854)
File: mempalace/hooks_cli.py
MempalaceConfig.hook_silent_save is stored correctly via the MCP tool, but hook_stop() never reads it, so the save block fires regardless of the setting.
A. Add import at top:
from .config import MempalaceConfig
B. In hook_stop(), after if since_last >= SAVE_INTERVAL and exchange_count > 0: add:
cfg = MempalaceConfig()
if cfg.hook_silent_save:
    _output({})
    return
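The effect of the guard can be sketched with stubs. Everything here is illustrative: StubConfig stands in for MempalaceConfig, the SAVE_INTERVAL value is made up, and the shape of the blocking payload is an assumption, not taken from hooks_cli.py. The sketch only shows that the flag short-circuits the block once the save condition is met.

```python
from dataclasses import dataclass

SAVE_INTERVAL = 15  # hypothetical value; the real constant lives in hooks_cli.py


@dataclass
class StubConfig:
    """Stand-in for MempalaceConfig exposing only the flag used here."""
    hook_silent_save: bool = False


def stop_hook_decision(since_last, exchange_count, cfg):
    """Return the payload the hook would emit (illustrative, not the real hook_stop)."""
    if since_last >= SAVE_INTERVAL and exchange_count > 0:
        if cfg.hook_silent_save:
            return {}  # silent mode: empty payload, nothing blocks
        # Assumed payload shape, for illustration only.
        return {"decision": "block", "reason": "save checkpoint"}
    return {}


loud = stop_hook_decision(20, 3, StubConfig(hook_silent_save=False))
silent = stop_hook_decision(20, 3, StubConfig(hook_silent_save=True))
idle = stop_hook_decision(2, 3, StubConfig(hook_silent_save=False))
print(loud, silent, idle)
```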
Environment
- Windows 11 Pro (build 26200)
- Python 3.12.10
- MemPalace 3.3.2
- ChromaDB 1.5.8
All three fixes tested and working locally.