Concurrent writers (hooks + MCP server + CLI) corrupt the palace on chromadb 1.5.8 — sparse-file bloat + SIGSEGV

## Environment
- `mempalace` 3.3.2 (installed via pipx, 2026-04-21 release)
- `chromadb` 1.5.8 (latest)
- Python 3.12.12
- macOS 26.2 (25C56), ARM64 (Apple Silicon / M4)
- Claude Code CLI + Claude desktop both connected to the same palace

## Summary
When MemPalace's Claude Code hooks (`SessionStart`, `Stop`, `PreCompact`) spawn `mempalace mine` processes that run concurrently with an active MCP server (`python -m mempalace.mcp_server`) and/or an explicit CLI `mempalace mine` invocation, the chromadb 1.5.8 HNSW vector segment becomes corrupted. The corruption manifests as:

1. **Sparse-file bloat**: an HNSW segment's `link_lists.bin` reports a small logical size (~300 KB) but allocates hundreds of gigabytes of disk blocks — in my case **362 GB** (337 GiB) allocated for a 302,760-byte file. `du -sh ~/.mempalace/palace/` reported 338 GB.
2. **Hard SIGSEGV**: any subsequent open of the palace by chromadb crashes Python with `EXC_BAD_ACCESS` / `KERN_INVALID_ADDRESS`, because chromadb mmaps the full allocated region and reads run off the end of valid memory.

`mempalace repair` cannot recover the palace because chromadb itself crashes before the repair logic runs. Even after a successful surgical recovery followed by `mempalace repair --yes` (34905 drawers rebuilt), chromadb still segfaults on the next read. Only a full nuke and re-mine fixes it, and the corruption reproduces within minutes of restoring the hooks.

## Detection
```sh
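# Logical size in bytes vs. allocated 512-byte blocks per HNSW link list (macOS/BSD stat syntax)
stat -f "%N: logical=%z blocks=%b" ~/.mempalace/palace/*/link_lists.bin
# Total on-disk usage of the palace
du -sh ~/.mempalace/palace/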
```
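
The same comparison can be scripted portably. The following is a minimal Python sketch, not MemPalace code: the function names and the 100x threshold are my own assumptions, and the 4096-byte floor only exists so that tiny files are not flagged because of block granularity.

```python
import os
from pathlib import Path

# Assumed threshold: flag a file when its allocation exceeds 100x its logical size.
BLOAT_RATIO = 100
# Floor so tiny files (which still occupy one filesystem block) are not flagged.
MIN_LOGICAL = 4096


def is_bloated(logical, allocated, ratio=BLOAT_RATIO):
    """Return True when the allocated bytes dwarf the logical size."""
    return allocated > ratio * max(logical, MIN_LOGICAL)


def find_bloated(palace_dir, ratio=BLOAT_RATIO):
    """Return (path, logical, allocated) for each suspicious link_lists.bin."""
    suspicious = []
    for path in Path(palace_dir).expanduser().glob("*/link_lists.bin"):
        st = os.stat(path)
        logical = st.st_size
        allocated = st.st_blocks * 512  # st_blocks is in 512-byte units on POSIX
        if is_bloated(logical, allocated, ratio):
            suspicious.append((path, logical, allocated))
    return suspicious
```

Calling `find_bloated("~/.mempalace/palace")` would return an empty list for a healthy palace; with the numbers from this report (302,760 bytes logical, 337 GiB allocated) the file would be flagged.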

## Root cause (hypothesis)
chromadb 1.5.8's local HNSW segment writer is not safe under concurrent writers across processes. MemPalace currently allows three concurrent writer paths: (1) hooks spawning `mempalace mine`, (2) MCP server writes, (3) user-invoked CLI `mempalace mine`. No cross-process lock prevents them from touching the HNSW segment simultaneously.

## Suggestions
- Serialize all writers via a cross-process lock (flock on the palace dir).
- Or route all writes through the MCP server (CLI + hooks become MCP clients).
- Add a startup sanity check that warns when an HNSW `*.bin` file's allocated size (`blocks * 512`) far exceeds its logical size.
- Fsync the HNSW index after each batch so mid-write crashes don't leave sparse garbage.
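
To illustrate the first suggestion, here is a rough sketch of a cross-process writer lock using `fcntl.flock`. It is an assumption about how this could look, not existing MemPalace code: `palace_write_lock` and the `.write.lock` file name are hypothetical, and it locks a file inside the palace directory rather than the directory itself. The lock is advisory, so it only helps if every writer path (hooks, MCP server, CLI) takes it.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def palace_write_lock(palace_dir, lock_name=".write.lock"):
    """Hold an exclusive advisory lock while writing to the palace.

    lock_name is a hypothetical file name chosen for this sketch.
    """
    lock_path = os.path.join(os.path.expanduser(palace_dir), lock_name)
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until every other process holding the lock releases it.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Unlocking an fd that never acquired the lock is harmless.
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Each writer would wrap its chromadb writes in `with palace_write_lock("~/.mempalace/palace"):`. Because the kernel drops the lock when the process exits, a crashed writer does not leave a stale lock behind.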

Happy to share the full Python crash report on request.

