All notable changes to MemPalace are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Multi-language entity detection — lexical patterns (person verbs, pronouns, dialogue markers, project verbs, stopwords, candidate character classes) now live in the optional entity section of each locale JSON under mempalace/i18n/<lang>.json. Every public function in entity_detector accepts a languages= tuple and unions patterns across enabled locales. Default stays ("en",) so existing English-only callers are unchanged. (#911)
- Five new fully-supported locales with CLI strings, AAAK compression instructions, and entity-detection patterns:
- Brazilian Portuguese
pt-br(#156) - Russian
ru(#760) - Italian
it(#907) - Hindi
hi(#773) - Indonesian
id(#778)
- Brazilian Portuguese
MempalaceConfig.entity_languages— persistent palace-level language selection;MEMPALACE_ENTITY_LANGUAGESenv override;mempalace init --lang en,pt-brflag that saves to~/.mempalace/config.json(#911)- Per-language
candidate_pattern— non-Latin scripts register their own character class, so names likeJoão,Инна,राजare no longer silently dropped by the ASCII-only default (#911) - VSCode devcontainer matching the CI environment (#881)
MEMPAL_VERBOSEenv toggle — developers see diaries surfaced in chat while the default remains silent (#871)created_attimestamps included in search results (#846)
i18n / Unicode
- Script-aware word boundaries for combining-mark scripts — Python's
\bfails on Devanagari vowel signs (ा ी ु), Arabic, Hebrew, Thai, Tamil, Khmer etc., truncating names likeअनीता→अनीतand making person-verb patterns never fire. Locales now declare an optionalboundary_charsfield and the i18n loader expands\binto a script-aware lookaround boundary (#932) - Case-insensitive BCP 47 language code resolution —
--lang PT-BR,zh-cn,Pt-Brpreviously fell through to English silently; now resolve to the canonical locale file via lowercase matching, with the entity-pattern cache keyed on the canonical form so casing variations share one cache entry (#928) - Wire i18n candidate patterns into
miner._extract_entities_for_metadata(),palace.build_closet_lines(), andentity_registry.extract_unknown_candidates()— three code paths that still hardcoded ASCII-only[A-Z][a-z]{2,}and silently missed Cyrillic, accented Latin, and non-Latin entity metadata tags (#931) - Explicit
encoding="utf-8"onPath.read_text()calls across entity_registry, instructions_cli, split_mega_files, and onboarding tests — prevents Windows GBK (and other non-UTF-8) locales from corrupting UTF-8 files (#946, #776) ko.jsonstatus_drawersused{drawers}instead of{count}, showing the raw template string instead of the number (#758)- Move
test_i18n.pyfrom inside the installed package intotests/so pytest actually collects it; remove thesys.path.inserthack (#758) Dialect.from_config()defaulted tocurrent_lang()(module-global) when config had nolangkey — replaced with explicit"en"fallback for determinism (#758)
Other
- Guard
KnowledgeGraph.close()andquery_relationship/timeline/statsmethods with the instance lock to prevent concurrent-access corruption (#887, #884) - Replace invalid
{"decision": "allow"}with{}in hook responses — the string wasn't a valid decision value and triggered schema warnings (#885) entity_registry.research()defaults to local-only — previously made outbound Wikipedia HTTPS requests without explicit user opt-in; callers now must passallow_network=True(#811)- Precompact hook no longer blocks compaction when it fails or takes too long (#856, #858, #863)
- Redirect stdout to stderr during MCP server import so library logging can't corrupt the JSON-RPC channel (#225, #864)
mempalace initauto-adds per-project files to.gitignorein git repositories so users don't accidentally commitmempalace.yaml/entities.json(#185, #866)- Searcher guards against empty ChromaDB query results that previously raised on edge-case corpora (#195, #865)
- Return empty status instead of an error on a cold-start palace with no drawers yet (#830, #831)
- Restrict file permissions on sensitive palace data (#814)
- Slack transcript importer writes a provenance header and preserves speaker IDs (#815)
- Allow
mempalace mineto run in directories without a localmempalace.yamland surface the missing-yaml warning on stderr (#604) - Security hook injection fix (#812)
- Save hook auto-mines transcripts even when
MEMPAL_DIRis unset (#840) - Pin the Pages custom domain via a shipped
CNAMEin the deploy artifact (#877) - Version drift safeguard — sync pyproject +
version.py+ README badge in one place (#876) - Deploy docs workflow now runs on
developonly, preventing accidental main-branch deploys (#845)
- Regex compilation optimization for entity extraction — pre-compile per-entity pattern sets once and cache by
(name, languages)tuple, so multi-language callers don't thrash the cache (#880) - Knowledge-graph value sanitization now preserves natural punctuation (commas, colons, parentheses) that commonly appears in KG subject/object values (#873)
- Clarify that
mempalace initrequires a<dir>argument in CLI help text (#210, #862) - Domain name and specific impostor sites called out in the scam-alert section (#869)
- Tightened
SECURITY.mdwith a real version-support policy and the GHPVR-only reporting channel (#810) - Fixed stale
pyproject.tomlURLs (#853) - v4 planning prep (#852)
palace_graphtunnel helper test coverage (#908)
- Closet layer — a compact searchable index of pointers to verbatim drawers, enabling fast topical lookup without reading all content (#788)
- BM25 hybrid search — closets boost ranking, drawers remain the source of truth (#795, #829)
- Entity metadata on every drawer for filterable search (#829)
- Diary ingest — day-based rooms for conversation transcripts (#829)
- Cross-wing tunnels — explicit links between rooms in different wings for multi-project agents (#829)
- Drawer-grep — returns the best-matching chunk plus adjacent context drawers (#829)
- Offline fact checker against the entity registry and knowledge graph (#829)
- LLM-based closet regeneration — optional, bring-your-own endpoint, no mandatory API key (#793)
- Hall detection — routes drawer content to
emotions/technical/family/memory/identity/consciousness/creativehalls, enabling hall-based graph connectivity within wings (#835)
- Set
hnsw:space=cosinemetadata on all collection creation sites — fixes broken similarity scoring under ChromaDB's default L2 distance (#807, #218) - File-level locking prevents duplicate drawers when agents mine the same file concurrently (#784, #826)
- Hybrid closet+drawer retrieval — closets boost ranking, never gate results (#795)
- Stop hooks from making agents write in chat — saves tokens on every turn (#786)
- Strip system tags, hook output, and Claude UI chrome from drawers before filing (#785)
- Verbatim-safe
strip_noisescoped to Claude Code JSONL only (#785) - Prevent diary entry ID collisions via microsecond timestamp and full content hash (#819)
- Auto-rebuild stale drawers via
NORMALIZE_VERSIONschema gate - Enforce atomic topics in closets and extract richer pointers
- Sync
version.pyto matchpyproject.toml(#820) - Remove unused
mainimport frommempalace/__init__.py(#827) - README audit — fix 7 stale claims (tool count, version badge, wake-up token cost,
dialect.pylossless disclaimer,pyproject.tomlversion) with 42 regression-guard tests (#835)
- Optimize entity detection with regex caching and pre-compilation (#828)
- Extract locked filing block into helper to keep
mine_convosunder C901 complexity
- Add
docs/CLOSETS.md— closet layer overview - Fix stale
milla-jovovich/*org URLs in website and plugin manifests (#787) - Fix remaining stale org URLs in contributor docs (#808)
- Rewrite
README.mdandmempalaceofficial.combenchmark pages to remove category-error cross-system comparisons (R@5 retrieval recall had been listed next to competitor QA accuracy under one column), remove the retracted "+34% palace boost" claim from the surfaces where it had remained, replace the100%Haiku-rerank headline with the honest held-out98.4%R@5, drop the LoCoMo100%top-50 row (retrieval-bypass artefact), and fix the brokenaya-thekeeper/mempalreproduction URL (#875) - Add
docs/HISTORY.mdas the canonical home for corrections, retractions, and public notices; move the 2026-04-07 "Note from Milla & Ben" and the 2026-04-11 impostor-domain notice out ofREADME.md - Add v3.3.0 reproduction result JSONLs and the deterministic
seed=4250/450 LongMemEval split underbenchmarks/— every BENCHMARKS.md claim reproduces exactly
- Add test coverage for
mine_lock, closets, entity metadata, BM25, and diary - Verify
mine_lockvia disjoint critical-section intervals - Serialize
mine_lockconcurrency test with multiprocessing - Make diary state path assertion platform-neutral
- Add
TestTunnelscoverage for cross-wing tunnel operations - Ruff format with CI-pinned version (0.4.x); format
mempalace/palace.py
3.2.0 — 2026-04-12
- Remove
chromadb<0.7upper bound — unblocks installs against chromadb 1.x palaces (#690) - Bump version to 3.2.0 across
pyproject.toml,mempalace/version.py, README badge, and OpenClaw SKILL (#761)
- Harden palace deletion, WAL redaction, and MCP search input handling (#739)
- Consistent input validation, argument whitelisting, concurrency safety, and WAL fixes (#647)
- Remove hardcoded credential paths from benchmark runners (#177)
- Remove global SSL verification bypass in convomem_bench (#176)
- Parse Claude.ai privacy export with
messageskey and sender field (#685, #677) - Detect mtime changes in
_get_clientto prevent stale HNSW index (#757) - Hash full content in
tool_add_drawerdrawer ID — stable re-mines (#716) - Remove 10k drawer cap from status display (#707, #603)
- Correct typo in entity_detector interactive classification prompt (#755)
- Prevent convo_miner from re-processing 0-chunk files on every run (#732, #654)
- Remove silent 8-line AI response truncation in convo_miner (#708, #692)
- Store full AI response in convo_miner exchange chunking (#695)
- Fix
mine --dry-runTypeError on files with room=None (#687, #586) - Skip arg whitelist for handlers accepting
**kwargs(#684, #572) - Allow Unicode in
sanitize_name()— Latvian, CJK, Cyrillic (#683, #637) - Auto-repair BLOB seq_ids from chromadb 0.6→1.5 migration (#664)
- Remove no-op
ORT_DISABLE_COREMLenv var (#653, #397) - Disambiguate hook block reasons to name MemPalace explicitly (#666)
- Use epsilon comparison for mtime to prevent unnecessary re-mining (#610)
- Correct token count estimate in compress summary (#609)
- Implement MCP ping health checks (#600)
- Align
cmd_compressdict keys withcompression_stats()return values (#569) - Skip unreachable reparse points in
detect_rooms_from_folderson Windows (#558) - Prevent HNSW index bloat from duplicate
add()calls (#544, #525) - Purge stale drawers before re-mine to avoid hnswlib segfault (#544)
- Mitigate system prompt contamination in search queries (#385, #333)
- Count Codex
user_messageturns in_count_human_messages(#373, #347) - Paginate large collection reads and surface errors in MCP tools (#371, #339, #338)
- Expand
~in split command directory argument (#361) - Ignore
wait_for_previousargument to support Gemini MCP clients (#322) - Close KnowledgeGraph SQLite connections in test fixtures (#450)
- Remove duplicate cache variable declarations in mcp_server.py (#449)
- Add
--yesflag to init instructions for non-interactive use (#682, #534) - Add
mcpcommand with setup guidance (#315)
- i18n support — 8 languages (en, es, fr, de, ja, ko, zh-CN, zh-TW) (#718)
- New MCP tools: get/list/update drawer, hook settings, export (#667, #635)
mempalace migrate— recover palaces from different ChromaDB versions (#502)- Add OpenClaw/ClawHub skill (#491)
- Backend seam for pluggable storage backends (#413)
- Disable broken auto-bump workflow (#414)
- Improve agent readiness — AGENTS.md, dependabot, CODEOWNERS, labels (#497)
- Add CLAUDE.md and mission/principles to AGENTS.md (#720)
- Add VitePress documentation site (#439)
- Add warning about fake MemPalace websites (#598)
- Fix stale org URLs and PR branch target in contributor docs (#679)
- Fix misaligned architecture diagram (#734, #733)
- Add ROADMAP.md — v3.1.1 stability patch and v4.0.0-alpha plan
- ruff format convo_miner.py (#741)
- ruff format all Python files (#675)
- CI: trigger tests on develop branch PRs and pushes (#674)
- CI: fix GitHub Pages publishing (#691)
3.1.0 — 2026-04-09
- Harden inputs, fix shell injection, optimize DB access (#387)
- Sanitize SESSION_ID in save hook to prevent path traversal (#141)
- Sanitize error responses and remove
sys.exitfrom library code (#139) - Shell injection fix in hooks, Claude Code mining, chromadb pin (#114)
- MCP null args hang, repair infinite recursion, OOM on large files (#399)
- Release ChromaDB handles before rmtree on Windows (#392)
- Use
os.utimein mtime test for Windows compatibility (#392) - Negotiate MCP protocol version instead of hardcoding (#324)
- Use upsert and deterministic IDs to prevent data stagnation (#140)
- Make
drawer_iddeterministic for idempotent writes (#387) - Honest AAAK stats — word-based token estimator, lossy labels (#147)
- Room detection checks keywords against folder paths (#145)
- Use actual detected room in mine summary stats (#165)
- Honour
--palaceflag in mcp_server (#264) - Preserve default KG path when
--palacenot passed (#270) --yesflag skips all interactive prompts in init (#123)- Repair command, split args, Claude export, room keywords (#119)
- Replace Unicode separator in convo_miner.py for Windows compatibility (#129)
- Coerce MCP integer arguments to native Python int (#84)
- Batch ChromaDB reads to avoid SQLite variable limit (#66)
- Respect nested .gitignore rules during mining (#78)
- Narrow bare
except Exceptionto specific types where safe (#54) - Mark MD5 as non-security in miner drawer ID generation (#53)
- Remove dead code and duplicate set items in entity_registry.py (#42)
- Silence ChromaDB telemetry warnings and CoreML segfault on Apple Silicon (#236)
- Unify package and MCP version reporting (#16)
- Fix broken AAAK Dialect link in README (#238)
- Update input prompt for entity confirmation (#83)
- Preserve CLI exit codes, log tracebacks, sanitize search errors (#139)
- Enable SQLite WAL mode and add consistent LIMIT to KG timeline (#136)
- Add limit=10000 safety cap to all unbounded ChromaDB
.get()calls (#137) - Re-mine modified files, idempotent
add_drawer, cleanup ChromaDB handles (#140) - Resolve formatting, regression logic, and pytest defaults (#270)
- Use
parse_known_argsto allow importing mcp_server during pytest (#270)
- Package MemPalace as standard Claude and Codex plugins (#270)
- Add OpenAI Codex CLI JSONL normalizer (#61)
- Add Codex plugin support with hooks, commands, and documentation (#270)
- Add command documentation for help, init, mine, search, and status (#270)
- Cache ChromaDB
PersistentClientinstead of re-instantiating per call (#135) - Tighten chromadb version range and add
py.typedmarker (#142) - Consolidate split known-names config loading (#22)
- CI: add separate jobs for Windows and macOS testing
- CI: Upgrade GitHub Actions for Node 24 compatibility (#55)
- Add Gemini CLI setup guide and integration section (#106)
- Add beginner-friendly hooks tutorial (#103)
- Align MCP setup examples with shipped server (#21)
- Honest README update — own the mistakes, fix the claims
- Expand test coverage from 20 to 92 tests, migrate to uv (#131)
- Add scale benchmark suite — 106 tests (#223)
- Increase test coverage from 30% to 85%, fix Windows encoding bugs (#281)
- Add WAL mode and entity timeline limit assertions
- Add coverage for
file_already_minedmtime check
3.0.0 — 2026-04-06
Initial public release.
- Palace architecture with day-based rooms, drawers (verbatim), and closets (searchable index)
- AAAK compression dialect for memory folding
- Knowledge graph with entity detection and timeline queries
- MCP server for Claude, Codex, and Gemini integration
- CLI:
init,mine,search,status,compress,repair,split - Benchmark suite with recall and scale tests
- README with MCP flow, local model flow, and specialist agent documentation