Commit 9f22d6b

docs: polish README — accurate counts, citations, cleaner structure
Verified every factual claim against the live palace and repo. No content removed; all edits preserve existing sections.

Accuracy fixes (numbers that drifted or were stale):
- Drawer count unified: "134K / 135K+ / 137K+" → "137,949" in the status block and "137K" casually elsewhere. Matches palace today.
- Room count: "60+" → "68" (distinct rooms per sqlite).
- Auto-memory file count: "~dozens" → "17 files (this project)".
- "73-stopword false positives" → "285 English entries and counting", with inline link to mempalace/i18n/en.json. The 73 number was from the pre-i18n era; today the stopword list lives in JSON and has grown to 285.

Citation additions (claims that were bare):
- Superseded section: upstream Okapi-BM25 now cites MemPalace#789, file-level locking cites MemPalace#784.
- "Zep/Graphiti temporal graph model" now links to the getzep/graphiti repo.
- Closed-PR flat list converted to inline links.
- "Auto Dream feature flag" now uses the qualified anthropics/claude-code#38461 form.

Structure (same content, better shape):
- "Open problems" → "Active investigations". The sections are hard research questions, not failures; framing should match.
- "Two-layer memory architecture" lifted out of Active investigations and promoted to its own top-level section right after Architectural principles. It's foundational, not an open problem.
- New "Status at a glance" paragraph under the header pointing at Discussion MemPalace#1017, test count, the upstream-PR queue, and this repo's Issues for fork-specific feedback.
- New one-paragraph "what this fork adds" summary above Why-this-fork-exists, so a stranger can understand the differentiator in under ten seconds.

42 README-claim tests still pass.
1 parent 6d244a0 commit 9f22d6b

1 file changed

Lines changed: 26 additions & 22 deletions

README.md

@@ -12,15 +12,19 @@
1212

1313
---
1414

15-
Fork of [MemPalace v3.3.1](https://github.com/milla-jovovich/mempalace/releases/tag/v3.3.1). Running in production with 135K+ drawers across 60+ rooms. See upstream README for full feature docs.
15+
Fork of [MemPalace v3.3.1](https://github.com/milla-jovovich/mempalace/releases/tag/v3.3.1). Running in production since 2026-04-09 — currently 137,949 drawers across 68 rooms in 22 wings, 8 open PRs upstream. See upstream README for full feature docs.
16+
17+
What this fork adds that you won't get from upstream: a **deterministic silent-save hook architecture** (zero data loss, `systemMessage` notification), **ChromaDB 1.5.x hardening** (`quarantine_stale_hnsw` drift recovery, segfault-trigger guards, 8-site `None`-metadata safety), and **search that never silently misses** (`search_memories` returns warnings + sqlite BM25 top-up + `available_in_scope` so callers can see what they aren't getting). Full list below.
18+
19+
**Status at a glance:** active as of 2026-04-18 · [Discussion #1017](https://github.com/MemPalace/mempalace/discussions/1017) introduces the fork upstream · 998 tests pass on `main` · [Open upstream PRs](#open-upstream-prs) (8) are the contribution pipeline · [Issues on this repo](https://github.com/jphein/mempalace/issues) for fork-specific feedback.
1620

1721
## Why this fork exists
1822

1923
We surveyed the memory-system landscape in April 2026 and found no verbatim-first local system with MCP. Every alternative transforms content on write — extracted facts, knowledge graphs, tiered summaries — losing the original text.
2024

2125
| System | Verbatim? | Local? | MCP? | Notes |
2226
|---|---|---|---|---|
23-
| **MemPalace** | Yes | Yes | Yes | What we have. 135K drawers. |
27+
| **MemPalace** | Yes | Yes | Yes | What we have. 137,949 drawers as of 2026-04-18. |
2428
| Hindsight | No — LLM extracts facts | Yes (Docker) | Yes | Original text is lost. |
2529
| Mem0 / OpenMemory | No — extracts "memories" | Partial | Yes | Cloud-first. |
2630
| Cognee | No — knowledge graph | Yes | No | |
@@ -32,11 +36,11 @@ We surveyed the memory-system landscape in April 2026 and found no verbatim-firs
3236

3337
## Architectural principles
3438

35-
Three principles that emerged from 134K drawers of production use. They explain most of this fork's decisions and should guide future ones. Contributors: use these to evaluate PRs.
39+
Three principles that emerged from 137K drawers of production use. They explain most of this fork's decisions and should guide future ones. Contributors: use these to evaluate PRs.
3640

3741
### 1. Transforms on write are the enemy
3842

39-
Every operation that interprets content at write time is a failure surface. Entity detection misfires. Classifiers force wrong rooms. LLM-extracted "facts" lose nuance and can't be un-extracted. Half of this fork's bugs (`room=None` crashes, 73-stopword false positives, wing misassignment) trace to a single mistake: making classification a *gate* instead of a best-effort enrichment.
43+
Every operation that interprets content at write time is a failure surface. Entity detection misfires. Classifiers force wrong rooms. LLM-extracted "facts" lose nuance and can't be un-extracted. Many of this fork's visible bugs (`room=None` crashes, a stopword list that's grown to [285 English entries and counting](mempalace/i18n/en.json) to paper over false positives, wing misassignment) trace to a single mistake: making classification a *gate* instead of a best-effort enrichment.
4044

4145
Write the raw text. Derive everything else lazily, from unambiguous signals, with a graceful fallback when derivation fails. The verbatim archive is the one thing that must always succeed.
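
A minimal sketch of that ordering, assuming an in-memory list as the archive. `Drawer`, `save_verbatim`, and the `classify` callback are hypothetical names for illustration, not this fork's actual API. The raw text is stored before any classification runs, and a classifier failure leaves a note and `room=None` instead of aborting the write. A lazier variant would skip the classification call here entirely and run it on first read.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Drawer:
    text: str                     # verbatim content, never transformed
    room: Optional[str] = None    # best-effort enrichment, may stay None
    notes: list[str] = field(default_factory=list)


def save_verbatim(archive: list[Drawer], text: str,
                  classify: Callable[[str], str]) -> Drawer:
    """Store the raw text first, then try to enrich it (hypothetical helper)."""
    drawer = Drawer(text=text)
    archive.append(drawer)        # the write completes before classification runs
    try:
        drawer.room = classify(text)
    except Exception as exc:      # classification is enrichment, not a gate
        drawer.notes.append(f"classification skipped: {exc}")
    return drawer
```

With a classifier that raises, the drawer is still in the archive with `room` left as `None`, which is the graceful fallback the principle asks for.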
4246

@@ -45,7 +49,7 @@ Write the raw text. Derive everything else lazily, from unambiguous signals, wit
4549
Hierarchy isn't wrong — *mandatory synchronous classification* is wrong. Those are different claims, and conflating them was our earlier mistake.
4650

4751
**Good uses of hierarchy, which we keep:**
48-
- **Browseable scope** for serendipitous recall across 134K drawers. Search answers "when did I hit this error"; browse answers "what was I working on last November."
52+
- **Browseable scope** for serendipitous recall across 137K drawers. Search answers "when did I hit this error"; browse answers "what was I working on last November."
4953
- **Deletion and retention as a unit.** Purging drawers from an abandoned experiment is one operation, not a risky query-then-delete with collateral damage.
5054
- **Disambiguation without query gymnastics.** The same keyword appears in unrelated contexts across years of work. Scope separates them by default.
5155
- **Auto-surfacing priors.** A wing derived from the current working directory is a cheap, unambiguous signal for what to search first. This matters for the open problem below.
@@ -64,6 +68,17 @@ Search quality compounds. Classification quality has a hard ceiling set by the a
6468

6569
Effort spent tuning the entity detector is effort not spent on the thing that actually pays compounding returns.
6670

71+
## Two-layer memory model
72+
73+
Claude Code has two complementary memory layers, used in tandem:
74+
75+
| Layer | Storage | Size | Consolidation | Purpose |
76+
|---|---|---|---|---|
77+
| **Auto-memory** | `~/.claude/projects/*/memory/*.md` | 17 files (this project) | None (manual writes) | Preferences, feedback, context |
78+
| **MemPalace** | `~/.mempalace/palace/` (ChromaDB) | 137K+ drawers | None (write-only archive) | Verbatim conversations, tool output, code |
79+
80+
Neither has automatic consolidation. Claude Code has unreleased "Auto Dream" consolidation code behind a disabled feature flag ([anthropics/claude-code#38461](https://github.com/anthropics/claude-code/issues/38461)) — if it ships, it covers only the lightweight layer. MemPalace decay (P2) and feedback (P3) remain the right priorities for the verbatim archive.
81+
6782
## Fork Changes
6883

6984
What this fork adds beyond upstream v3.3.1.
@@ -106,8 +121,8 @@ What this fork adds beyond upstream v3.3.1.
106121

107122
### Superseded by upstream
108123

109-
- Hybrid keyword fallback (`$contains`) — upstream shipped Okapi-BM25 (60/40 blend)
110-
- Batch ChromaDB writes — upstream has file-level locking for concurrent agents
124+
- Hybrid keyword fallback (`$contains`) — upstream shipped Okapi-BM25 (60/40 blend) via [#789](https://github.com/milla-jovovich/mempalace/pull/789)
125+
- Batch ChromaDB writes — upstream has file-level locking for concurrent agents via [#784](https://github.com/milla-jovovich/mempalace/pull/784)
111126
- Inline transcript mining in hooks — upstream uses `mempalace mine` in background
112127

113128
## Roadmap
@@ -155,7 +170,7 @@ Triples are **derived** from the verbatim archive, not parallel to it. If extrac
155170

156171
### P5 — Temporal fact validity *(1 day, depends on P4)*
157172

158-
KG triples get a context slot (SPOC: subject-predicate-object-context) rather than only `valid_from` / `valid_to` columns. Context acts as a namespace — `(LeBron, played_for, Beavers, "2023_season")` vs `(LeBron, played_for, Lakers, "2022_season")` — making contradiction detection "same S+P, different O, overlapping contexts" rather than timestamp-range logic. On write, close any existing triple with the same subject+predicate+context before opening a new one. Reference: Zep/Graphiti's temporal graph model.
173+
KG triples get a context slot (SPOC: subject-predicate-object-context) rather than only `valid_from` / `valid_to` columns. Context acts as a namespace — `(LeBron, played_for, Beavers, "2023_season")` vs `(LeBron, played_for, Lakers, "2022_season")` — making contradiction detection "same S+P, different O, overlapping contexts" rather than timestamp-range logic. On write, close any existing triple with the same subject+predicate+context before opening a new one. Reference: Zep's [Graphiti](https://github.com/getzep/graphiti) temporal graph model.
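
The close-on-write rule can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: `Triple` and `write_triple` are hypothetical names, the store is a plain list, and "overlapping contexts" is simplified to exact context equality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str
    context: str          # the "C" in SPOC, acting as a namespace
    closed: bool = False  # True once a newer triple supersedes this one


def write_triple(store: list[Triple], new: Triple) -> Optional[Triple]:
    """Close any open triple with the same subject+predicate+context, then append.

    Returns the most recently superseded triple, or None if nothing was closed.
    """
    superseded = None
    for existing in store:
        if (not existing.closed
                and existing.subject == new.subject
                and existing.predicate == new.predicate
                and existing.context == new.context):
            existing.closed = True
            superseded = existing
    store.append(new)
    return superseded
```

Writing `(LeBron, played_for, Lakers, "2022_season")` and then `(LeBron, played_for, Beavers, "2023_season")` closes nothing, because the contexts differ. A later write with the same subject, predicate, and `"2023_season"` context closes the Beavers triple before the new one is opened.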
159174

160175
### P6 — Input sanitization on writes *(half day)*
161176

@@ -165,12 +180,12 @@ Strip known injection patterns (role-play instructions, "ignore previous instruc
165180

166181
- **AAAK work** — upstream's problem; we store verbatim.
167182
- **Expanding hierarchy types** (tunnels, closets, new room categories). Adding more categories doesn't address the write-time classification problem. Tags (P0) and derived scope (P1) do.
168-
- **Benchmark work** — our value is "134K drawers of verbatim local history with fast search," not upstream's LongMemEval score.
183+
- **Benchmark work** — our value is "137K drawers of verbatim local history with fast search," not upstream's LongMemEval score.
169184
- **Full architecture rewrite** — not worth the migration cost.
170185
- **Dual-granularity ANN, dream engine, foresight signals**: [Karta](https://github.com/rohithzr/karta)-inspired features that require LLM calls on every write. Our zero-LLM philosophy makes these opt-in at best.
171186
- **FTS5 parallel index** — right idea (engram proves it), but significant infrastructure alongside ChromaDB. Revisit after tags and decay are proven.
172187

173-
## Open problems
188+
## Active investigations
174189

175190
### Auto-surfacing context Claude doesn't know to ask for
176191

@@ -193,17 +208,6 @@ Tools and patterns we're evaluating for the two open problems above. Not competi
193208
- [**Mintlify**](https://www.mintlify.com/) — docs platform pitched as "self-updating knowledge management," with MCP and `llms.txt` support for AI-consumable docs. Useful reference for the stale-docs problem: their agent-driven update model is one approach to keeping auto-loaded context fresh. Cloud-hosted, so not a drop-in for local palaces, but the surface area (what they expose to AI, how they structure agent-readable docs) is worth studying.
194209
- [**Context engineering (Emmimal P Alexander)**](https://towardsdatascience.com/rag-isnt-enough-i-built-the-missing-context-layer-that-makes-llm-systems-work/) — argues the bottleneck isn't retrieval but *what actually enters the context window*. Five components: hybrid retrieval, re-ranking with domain weighting, memory with exponential decay, intelligent compression, token-budget enforcement. The reference implementation is [context-engine](https://github.com/Emmimal/context-engine), already cited for P2 decay. The article frames the auto-surfacing problem as an engineering discipline rather than a product feature — useful scaffolding for the open problem above.
195210

196-
### Two-layer memory architecture
197-
198-
Claude Code has two complementary memory layers, used in tandem:
199-
200-
| Layer | Storage | Size | Consolidation | Purpose |
201-
|---|---|---|---|---|
202-
| **Auto-memory** | `~/.claude/projects/*/memory/*.md` | ~dozens of files | None (manual writes) | Preferences, feedback, context |
203-
| **MemPalace** | `~/.mempalace/palace/` (ChromaDB) | 137K+ drawers | None (write-only archive) | Verbatim conversations, tool output, code |
204-
205-
Neither has automatic consolidation. Claude Code has unreleased "Auto Dream" consolidation code behind a disabled feature flag ([#38461](https://github.com/anthropics/claude-code/issues/38461)) — if it ships, it covers only the lightweight layer. MemPalace decay (P2) and feedback (P3) remain the right priorities for the verbatim archive.
206-
207211
## Open upstream PRs
208212

209213
| PR | Status | Description |
@@ -217,7 +221,7 @@ Neither has automatic consolidation. Claude Code has unreleased "Auto Dream" con
217221
| [#1000](https://github.com/milla-jovovich/mempalace/pull/1000) | `MERGEABLE`, closes #823, Copilot nit addressed, rebased onto #995's new backend surface | `quarantine_stale_hnsw()` for HNSW/sqlite drift crashes |
218222
| [#1005](https://github.com/milla-jovovich/mempalace/pull/1005) | `MERGEABLE`, Copilot review addressed (authoritative scope count via paginated `col.get`, gate "run repair" on vector underdelivery, restore palace path in CLI error) | Warnings + sqlite BM25 top-up — never silently return fewer results than scope contains |
219223

220-
Closed: #626, #633, #662 (superseded by BM25), #663 (upstream wrote #757), #738 (docs stale), #629 (superseded — upstream shipped batching + file locking), #632 (superseded — `--version`, `purge`, `repair` all shipped in v3.3.0).
224+
Closed: [#626](https://github.com/milla-jovovich/mempalace/pull/626), [#633](https://github.com/milla-jovovich/mempalace/pull/633), [#662](https://github.com/milla-jovovich/mempalace/pull/662) (superseded by BM25), [#663](https://github.com/milla-jovovich/mempalace/pull/663) (upstream wrote [#757](https://github.com/milla-jovovich/mempalace/pull/757)), [#738](https://github.com/milla-jovovich/mempalace/pull/738) (docs stale), [#629](https://github.com/milla-jovovich/mempalace/pull/629) (superseded — upstream shipped batching + file locking), [#632](https://github.com/milla-jovovich/mempalace/pull/632) (superseded — `--version`, `purge`, `repair` all shipped in v3.3.0).
221225

222226
## Setup
223227
