Changelog

All notable changes to MemPalace are documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


[3.3.3] — 2026-04-23

Bug Fixes

  • Install regression: the mempalace-mcp console script is now declared in pyproject.toml, matching the reference to it in .claude-plugin/plugin.json. In v3.3.2 the two drifted apart (plugin.json shipped the new "command": "mempalace-mcp" form before the matching entry point landed), so every fresh pip install mempalace==3.3.2 produced a Claude Code plugin config pointing at a binary that wasn't installed. (#1093, #340)
  • Restore silent-save visibility after the Claude Code 2.1.114 client regression — production transcript saves were failing silently until this PR. (#1021)
  • Paginate status-path metadata fetches so large palaces don't trip SQLite variable limits. (#851)
  • Resolve the Claude plugin hook runner across platform / plugin-dir variations; previously broke on Windows and some macOS layouts. (#942)
  • Real python3 resolution for .sh hooks with a MEMPAL_PYTHON override path. (#833)
  • Add optional wing parameter to tool_diary_write / tool_diary_read and derive per-project wing from the Claude Code transcript path when writing from the stop hook — diary entries from different projects no longer collapse into a shared default wing. (#659)
  • Treat empty string as "no filter" in mempalace_search wing/room; LLM agents that default to filling every optional parameter with "" no longer get bounced with must be a non-empty string. (#1097, #1084)
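
The empty-string handling in the last entry can be pictured with a minimal sketch. The helper names `normalize_filter` and `build_where` are hypothetical and not taken from the MemPalace source; the sketch only illustrates treating `""` the same as an omitted parameter.

```python
from typing import Dict, Optional


def normalize_filter(value: Optional[str]) -> Optional[str]:
    """Map None and the empty string to None, meaning 'no filter'."""
    if value is None or value == "":
        return None
    return value


def build_where(wing: Optional[str] = None, room: Optional[str] = None) -> Dict[str, str]:
    """Build a metadata filter that contains only the filters actually set.

    Hypothetical helper: the real search tool may structure its filter differently.
    """
    where: Dict[str, str] = {}
    for key, raw in (("wing", wing), ("room", room)):
        value = normalize_filter(raw)
        if value is not None:
            where[key] = value
    return where
```

With this shape, an agent that sends `wing=""` and `room=""` gets an unfiltered search instead of a validation error.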

Improvements

  • Deterministic hook saves. Save hook now uses a silent Python API path, so successive hook invocations produce reproducible results and zero data loss on the hot path. (#673)
  • Graph cache with write-invalidation inside build_graph() — warm-path calls no longer rebuild the palace-graph per request. (#661)
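
As an illustration of a cache with write-invalidation, here is a minimal sketch. The `GraphCache` class and its `builds` counter are assumptions made for the example; the actual caching inside build_graph() may be organized differently.

```python
from typing import Callable, Optional


class GraphCache:
    """Keep a built graph in memory until a write marks it stale."""

    def __init__(self, builder: Callable[[], dict]) -> None:
        self._builder = builder
        self._graph: Optional[dict] = None
        self.builds = 0  # counts real rebuilds, for illustration only

    def get(self) -> dict:
        """Return the cached graph, rebuilding only when it is missing."""
        if self._graph is None:
            self._graph = self._builder()
            self.builds += 1
        return self._graph

    def invalidate(self) -> None:
        """Call from every write path so the next read rebuilds the graph."""
        self._graph = None
```

Warm-path reads hit the cached object, and any write calls `invalidate()` so that the next read sees fresh data.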

Added

  • i18n: Belarusian translation. (#1051)
  • i18n: entity detection for German, Spanish, and French locales. (#1001)
  • i18n: Traditional + Simplified Chinese entity detection. (#945)

Known — deferred to v3.3.4

  • HNSW parallel-insert SIGSEGV when hnsw:num_threads is unset on collection creation (#974) — fix in-flight as #976, awaiting rebase against develop.

[3.3.2] — 2026-04-19

Bug Fixes

  • Fix silent drop of .jsonl files in project miner; raise MAX_FILE_SIZE cap from 10 MB to 500 MB so large transcripts no longer fall through unnoticed. Adds a tandem sweeper — a message-level, timestamp-coordinated, idempotent safety net that catches anything the primary miner missed. (#998)
  • mempalace sweep <target> CLI to run the sweeper on demand against a transcript file or a directory. (#998)
  • Guard Layer3.search_raw against None doc/meta rows returned by ChromaDB — prevents AttributeError crashes on mixed-schema palaces. (#1011, #1013)
  • Guard searcher API path, closet loop, and miner status histogram against None metadata; matching guards added to tool_status / list_wings / list_rooms / get_taxonomy in the MCP server. (#999)
  • Upgrade chromadb floor to >=1.5.4 for Python 3.13 / 3.14 compatibility and pin upper bound to <2 so future breaking majors don't silently install. (#1010)
  • Fix Unicode checkmark rendering on Windows terminals that can't encode the glyph — avoids UnicodeEncodeError crashes on first-run output. (#681)
  • quarantine_stale_hnsw — on open, detect HNSW segment directories whose data_level0.bin is significantly older than chroma.sqlite3 and rename them out of the way. Recovers cleanly from HNSW/sqlite drift that otherwise causes SIGSEGV on count() / query(...) (the chroma-core/chroma#2594 failure mode). Rebuilds the index lazily on next use. (#1000)
  • PID file guard: mine writes a per-source-directory PID file and refuses to start if an existing mine is still running, preventing process stacking that bloats HNSW and wedges concurrent writes. Includes a cross-platform PID liveness check (os.kill(pid, 0) terminates the target process on Windows, so the guard falls back to a platform-aware probe). (#1023)
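
The quarantine_stale_hnsw entry above describes an mtime comparison followed by a rename. A rough sketch of that idea follows; the function name, the one-hour threshold, and the `.stale-` suffix are all assumptions for illustration, not the shipped implementation.

```python
import time
from pathlib import Path
from typing import List

STALE_SECONDS = 3600.0  # hypothetical threshold for "significantly older"


def quarantine_stale_segments(palace_dir: Path, stale_seconds: float = STALE_SECONDS) -> List[Path]:
    """Rename segment directories whose data_level0.bin lags chroma.sqlite3."""
    sqlite_file = palace_dir / "chroma.sqlite3"
    if not sqlite_file.exists():
        return []
    db_mtime = sqlite_file.stat().st_mtime
    moved: List[Path] = []
    for seg in sorted(palace_dir.iterdir()):
        # Skip plain files and directories that were already quarantined.
        if not seg.is_dir() or ".stale-" in seg.name:
            continue
        index_file = seg / "data_level0.bin"
        if not index_file.exists():
            continue
        if db_mtime - index_file.stat().st_mtime > stale_seconds:
            target = seg.with_name(f"{seg.name}.stale-{int(time.time())}")
            seg.rename(target)
            moved.append(target)
    return moved
```

Renaming rather than deleting keeps the old segment available for inspection while the index is rebuilt on next use.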

Improvements

  • RFC 001 §10 — typed backend contracts. BaseBackend now returns typed QueryResult / GetResult dataclasses and PalaceRef for palace identity; registry-based backend discovery. Internal refactor; no user-facing API change. (#995)
  • RFC 002 §9 — source adapter scaffolding. Introduces BaseSourceAdapter, adapter registry, and PalaceContext — the plumbing that future pluggable ingest sources will target. Internal refactor; no user-facing API change yet. (#1014)

Documentation

  • RFC 002 — full specification for the source adapter plugin system (future pluggable ingest). (#990)
  • First-run help text and README now reference the real ~/.claude/projects/<project>/ path shape instead of the placeholder /path/to/transcripts. (#996, #1012)

Internal

  • Harden sweeper for production: verbatim tool blocks, full session_id, logged failures.
  • Address Copilot review on #995: cursor tie-break, honest metrics, accurate comments.
  • Test hygiene: avoid ONNX network download in update-length validation tests; dedup update-length-validation tests; fix Windows file-lock in cache-invalidation test.

[3.3.1] — 2026-04-16

New Features

  • Multi-language entity detection: lexical patterns (person verbs, pronouns, dialogue markers, project verbs, stopwords, candidate character classes) now live in the optional entity section of each locale JSON under mempalace/i18n/<lang>.json. Every public function in entity_detector accepts a languages= tuple and unions patterns across enabled locales. Default stays ("en",) so existing English-only callers are unchanged. (#911)
  • Five new fully-supported locales with CLI strings, AAAK compression instructions, and entity-detection patterns:
    • Brazilian Portuguese pt-br (#156)
    • Russian ru (#760)
    • Italian it (#907)
    • Hindi hi (#773)
    • Indonesian id (#778)
  • MempalaceConfig.entity_languages — persistent palace-level language selection; MEMPALACE_ENTITY_LANGUAGES env override; mempalace init --lang en,pt-br flag that saves to ~/.mempalace/config.json (#911)
  • Per-language candidate_pattern — non-Latin scripts register their own character class, so names like João, Инна, राज are no longer silently dropped by the ASCII-only default (#911)
  • VSCode devcontainer matching the CI environment (#881)
  • MEMPAL_VERBOSE env toggle — developers see diaries surfaced in chat while the default remains silent (#871)
  • created_at timestamps included in search results (#846)
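
To illustrate how a languages= tuple can union per-locale candidate patterns, here is a small sketch. The `CANDIDATE_PATTERNS` table and `find_candidates` function are hypothetical; the real patterns live in the locale JSON files and may differ.

```python
import re
from typing import Dict, Set, Tuple

# Hypothetical per-locale candidate patterns, chosen for illustration only.
CANDIDATE_PATTERNS: Dict[str, str] = {
    "en": r"[A-Z][a-z]{2,}",
    "pt-br": r"[A-ZÀ-Ý][a-zà-ÿ]{2,}",
    "ru": r"[А-ЯЁ][а-яё]{2,}",
}


def find_candidates(text: str, languages: Tuple[str, ...] = ("en",)) -> Set[str]:
    """Union the candidate names matched by each enabled locale's pattern."""
    found: Set[str] = set()
    for lang in languages:
        found.update(re.findall(CANDIDATE_PATTERNS[lang], text))
    return found
```

With the default `("en",)` only ASCII names are found; enabling `"pt-br"` and `"ru"` also picks up names such as João and Инна that the ASCII-only class skips.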

Bug Fixes

i18n / Unicode

  • Script-aware word boundaries for combining-mark scripts: Python's \b fails on Devanagari vowel signs (ा ी ु), Arabic, Hebrew, Thai, Tamil, Khmer etc., truncating names like अनीता to अनीत and preventing person-verb patterns from ever firing. Locales now declare an optional boundary_chars field, and the i18n loader expands \b into a script-aware lookaround boundary (#932)
  • Case-insensitive BCP 47 language code resolution — --lang PT-BR, zh-cn, Pt-Br previously fell through to English silently; now resolve to the canonical locale file via lowercase matching, with the entity-pattern cache keyed on the canonical form so casing variations share one cache entry (#928)
  • Wire i18n candidate patterns into miner._extract_entities_for_metadata(), palace.build_closet_lines(), and entity_registry.extract_unknown_candidates() — three code paths that still hardcoded ASCII-only [A-Z][a-z]{2,} and silently missed Cyrillic, accented Latin, and non-Latin entity metadata tags (#931)
  • Explicit encoding="utf-8" on Path.read_text() calls across entity_registry, instructions_cli, split_mega_files, and onboarding tests — prevents Windows GBK (and other non-UTF-8) locales from corrupting UTF-8 files (#946, #776)
  • ko.json status_drawers used {drawers} instead of {count}, showing the raw template string instead of the number (#758)
  • Move test_i18n.py from inside the installed package into tests/ so pytest actually collects it; remove the sys.path.insert hack (#758)
  • Dialect.from_config() defaulted to current_lang() (module-global) when config had no lang key — replaced with explicit "en" fallback for determinism (#758)
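
The script-aware boundary fix above can be sketched as follows. This is a simplified illustration: `expand_boundaries` and the Devanagari ranges are assumptions, and the sketch only rewrites a \b at the very start or end of a pattern, whereas the real loader may handle more cases.

```python
import re

# Hypothetical boundary_chars for Devanagari: dependent signs that Python's
# \w does not match. The ranges are chosen for illustration.
DEVANAGARI_BOUNDARY_CHARS = "\u0900-\u0903\u093a-\u094f"


def expand_boundaries(pattern: str, boundary_chars: str) -> str:
    """Replace a leading or trailing \\b with script-aware lookarounds."""
    word = rf"[\w{boundary_chars}]"
    if pattern.startswith(r"\b"):
        # No word character or combining sign may precede the match.
        pattern = rf"(?<!{word})" + pattern[2:]
    if pattern.endswith(r"\b"):
        # No word character or combining sign may follow the match.
        pattern = pattern[:-2] + rf"(?!{word})"
    return pattern
```

Because the lookarounds treat the listed combining signs as part of a word, a pattern for the shorter name no longer matches inside the longer one.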

Other

  • Guard KnowledgeGraph.close() and query_relationship/timeline/stats methods with the instance lock to prevent concurrent-access corruption (#887, #884)
  • Replace invalid {"decision": "allow"} with {} in hook responses — the string wasn't a valid decision value and triggered schema warnings (#885)
  • entity_registry.research() defaults to local-only — previously made outbound Wikipedia HTTPS requests without explicit user opt-in; callers now must pass allow_network=True (#811)
  • Precompact hook no longer blocks compaction when it fails or takes too long (#856, #858, #863)
  • Redirect stdout to stderr during MCP server import so library logging can't corrupt the JSON-RPC channel (#225, #864)
  • mempalace init auto-adds per-project files to .gitignore in git repositories so users don't accidentally commit mempalace.yaml / entities.json (#185, #866)
  • Searcher guards against empty ChromaDB query results that previously raised on edge-case corpora (#195, #865)
  • Return empty status instead of an error on a cold-start palace with no drawers yet (#830, #831)
  • Restrict file permissions on sensitive palace data (#814)
  • Slack transcript importer writes a provenance header and preserves speaker IDs (#815)
  • Allow mempalace mine to run in directories without a local mempalace.yaml and surface the missing-yaml warning on stderr (#604)
  • Security hook injection fix (#812)
  • Save hook auto-mines transcripts even when MEMPAL_DIR is unset (#840)
  • Pin the Pages custom domain via a shipped CNAME in the deploy artifact (#877)
  • Version drift safeguard — sync pyproject + version.py + README badge in one place (#876)
  • Deploy docs workflow now runs on develop only, preventing accidental main-branch deploys (#845)
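
One way to keep import-time output off a JSON-RPC stdout channel, as described in the MCP server entry above, is to redirect stdout to stderr for the duration of the import. The helper name below is hypothetical, and the sketch only covers Python-level writes to sys.stdout, not output written directly to the file descriptor by native code.

```python
import contextlib
import importlib
import sys
from types import ModuleType


def import_without_stdout_noise(module_name: str) -> ModuleType:
    """Import a module while sending anything it prints to stderr."""
    with contextlib.redirect_stdout(sys.stderr):
        return importlib.import_module(module_name)
```

A server speaking JSON-RPC over stdio could load its noisier dependencies this way before it starts writing protocol messages to stdout.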

Improvements

  • Regex compilation optimization for entity extraction — pre-compile per-entity pattern sets once and cache by (name, languages) tuple, so multi-language callers don't thrash the cache (#880)
  • Knowledge-graph value sanitization now preserves natural punctuation (commas, colons, parentheses) that commonly appears in KG subject/object values (#873)
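
The regex caching entry above keys compiled patterns on a (name, languages) tuple. A minimal sketch with functools.lru_cache is shown here; the `PERSON_VERBS` table and the pattern shape are assumptions, not the project's actual pattern sets.

```python
import re
from functools import lru_cache
from typing import Dict, Tuple

# Hypothetical per-language person verbs, for illustration only.
PERSON_VERBS: Dict[str, Tuple[str, ...]] = {
    "en": ("said", "asked"),
    "pt-br": ("disse", "perguntou"),
}


@lru_cache(maxsize=1024)
def person_verb_pattern(name: str, languages: Tuple[str, ...]) -> "re.Pattern[str]":
    """Compile once per (name, languages) pair and reuse the result."""
    verbs = [verb for lang in languages for verb in PERSON_VERBS[lang]]
    alternation = "|".join(re.escape(verb) for verb in verbs)
    return re.compile(rf"{re.escape(name)}\s+(?:{alternation})", re.IGNORECASE)
```

The languages argument must be a tuple rather than a list so that it is hashable and can take part in the cache key.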

Documentation

  • Clarify that mempalace init requires a <dir> argument in CLI help text (#210, #862)
  • Domain name and specific impostor sites called out in the scam-alert section (#869)
  • Tightened SECURITY.md with a real version-support policy and the GHPVR-only reporting channel (#810)
  • Fixed stale pyproject.toml URLs (#853)
  • v4 planning prep (#852)

Internal

  • palace_graph tunnel helper test coverage (#908)

[3.3.0] — 2026-04-13

New Features

  • Closet layer — a compact searchable index of pointers to verbatim drawers, enabling fast topical lookup without reading all content (#788)
  • BM25 hybrid search — closets boost ranking, drawers remain the source of truth (#795, #829)
  • Entity metadata on every drawer for filterable search (#829)
  • Diary ingest — day-based rooms for conversation transcripts (#829)
  • Cross-wing tunnels — explicit links between rooms in different wings for multi-project agents (#829)
  • Drawer-grep — returns the best-matching chunk plus adjacent context drawers (#829)
  • Offline fact checker against the entity registry and knowledge graph (#829)
  • LLM-based closet regeneration — optional, bring-your-own endpoint, no mandatory API key (#793)
  • Hall detection — routes drawer content to emotions / technical / family / memory / identity / consciousness / creative halls, enabling hall-based graph connectivity within wings (#835)
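
The rule that closets boost ranking while drawers remain the source of truth can be sketched as a re-ranking step. The function name, the additive boost, and its weight are assumptions for illustration; the actual BM25 hybrid scoring is not specified in this changelog.

```python
from typing import Dict, List, Set, Tuple

CLOSET_BOOST = 0.2  # hypothetical weight


def rank_drawers(
    drawer_scores: Dict[str, float],
    closet_hits: Set[str],
    boost: float = CLOSET_BOOST,
) -> List[Tuple[str, float]]:
    """Add a boost to drawers referenced by a closet hit, then sort.

    Every drawer stays in the result: closets reorder, they never filter.
    """
    ranked = [
        (drawer_id, score + (boost if drawer_id in closet_hits else 0.0))
        for drawer_id, score in drawer_scores.items()
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

A drawer with no closet pointer keeps its original score and still appears in the output, only lower in the order.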

Bug Fixes

  • Set hnsw:space=cosine metadata on all collection creation sites — fixes broken similarity scoring under ChromaDB's default L2 distance (#807, #218)
  • File-level locking prevents duplicate drawers when agents mine the same file concurrently (#784, #826)
  • Hybrid closet+drawer retrieval — closets boost ranking, never gate results (#795)
  • Stop hooks from making agents write in chat — saves tokens on every turn (#786)
  • Strip system tags, hook output, and Claude UI chrome from drawers before filing (#785)
  • Verbatim-safe strip_noise scoped to Claude Code JSONL only (#785)
  • Prevent diary entry ID collisions via microsecond timestamp and full content hash (#819)
  • Auto-rebuild stale drawers via NORMALIZE_VERSION schema gate
  • Enforce atomic topics in closets and extract richer pointers
  • Sync version.py to match pyproject.toml (#820)
  • Remove unused main import from mempalace/__init__.py (#827)
  • README audit — fix 7 stale claims (tool count, version badge, wake-up token cost, dialect.py lossless disclaimer, pyproject.toml version) with 42 regression-guard tests (#835)
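
The diary ID collision fix above combines a microsecond timestamp with a hash of the full content. A minimal sketch of such an ID follows; the `diary_` prefix, the SHA-256 choice, and the exact layout are assumptions, not the shipped format.

```python
import hashlib
from datetime import datetime


def diary_entry_id(content: str, now: datetime) -> str:
    """Build an ID from a microsecond timestamp and a full-content hash."""
    stamp = now.strftime("%Y%m%dT%H%M%S%f")  # %f gives six microsecond digits
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"diary_{stamp}_{digest}"
```

Two entries collide only if they share both the same microsecond and the same content, in which case they are the same entry.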

Improvements

  • Optimize entity detection with regex caching and pre-compilation (#828)
  • Extract locked filing block into helper to keep mine_convos under C901 complexity

Documentation

  • Add docs/CLOSETS.md — closet layer overview
  • Fix stale milla-jovovich/* org URLs in website and plugin manifests (#787)
  • Fix remaining stale org URLs in contributor docs (#808)
  • Rewrite README.md and mempalaceofficial.com benchmark pages to remove category-error cross-system comparisons (R@5 retrieval recall had been listed next to competitor QA accuracy under one column), remove the retracted "+34% palace boost" claim from the surfaces where it had remained, replace the 100% Haiku-rerank headline with the honest held-out 98.4% R@5, drop the LoCoMo 100% top-50 row (retrieval-bypass artefact), and fix the broken aya-thekeeper/mempal reproduction URL (#875)
  • Add docs/HISTORY.md as the canonical home for corrections, retractions, and public notices; move the 2026-04-07 "Note from Milla & Ben" and the 2026-04-11 impostor-domain notice out of README.md
  • Add v3.3.0 reproduction result JSONLs and the deterministic seed=42 50/450 LongMemEval split under benchmarks/ — every BENCHMARKS.md claim reproduces exactly

Internal

  • Add test coverage for mine_lock, closets, entity metadata, BM25, and diary
  • Verify mine_lock via disjoint critical-section intervals
  • Serialize mine_lock concurrency test with multiprocessing
  • Make diary state path assertion platform-neutral
  • Add TestTunnels coverage for cross-wing tunnel operations
  • Ruff format with CI-pinned version (0.4.x); format mempalace/palace.py

3.2.0 — 2026-04-12

Packaging

  • Remove chromadb<0.7 upper bound — unblocks installs against chromadb 1.x palaces (#690)
  • Bump version to 3.2.0 across pyproject.toml, mempalace/version.py, README badge, and OpenClaw SKILL (#761)

Security

  • Harden palace deletion, WAL redaction, and MCP search input handling (#739)
  • Consistent input validation, argument whitelisting, concurrency safety, and WAL fixes (#647)
  • Remove hardcoded credential paths from benchmark runners (#177)
  • Remove global SSL verification bypass in convomem_bench (#176)

Bug Fixes

  • Parse Claude.ai privacy export with messages key and sender field (#685, #677)
  • Detect mtime changes in _get_client to prevent stale HNSW index (#757)
  • Hash full content in tool_add_drawer drawer ID — stable re-mines (#716)
  • Remove 10k drawer cap from status display (#707, #603)
  • Correct typo in entity_detector interactive classification prompt (#755)
  • Prevent convo_miner from re-processing 0-chunk files on every run (#732, #654)
  • Remove silent 8-line AI response truncation in convo_miner (#708, #692)
  • Store full AI response in convo_miner exchange chunking (#695)
  • Fix mine --dry-run TypeError on files with room=None (#687, #586)
  • Skip arg whitelist for handlers accepting **kwargs (#684, #572)
  • Allow Unicode in sanitize_name() — Latvian, CJK, Cyrillic (#683, #637)
  • Auto-repair BLOB seq_ids from chromadb 0.6→1.5 migration (#664)
  • Remove no-op ORT_DISABLE_COREML env var (#653, #397)
  • Disambiguate hook block reasons to name MemPalace explicitly (#666)
  • Use epsilon comparison for mtime to prevent unnecessary re-mining (#610)
  • Correct token count estimate in compress summary (#609)
  • Implement MCP ping health checks (#600)
  • Align cmd_compress dict keys with compression_stats() return values (#569)
  • Skip unreachable reparse points in detect_rooms_from_folders on Windows (#558)
  • Prevent HNSW index bloat from duplicate add() calls (#544, #525)
  • Purge stale drawers before re-mine to avoid hnswlib segfault (#544)
  • Mitigate system prompt contamination in search queries (#385, #333)
  • Count Codex user_message turns in _count_human_messages (#373, #347)
  • Paginate large collection reads and surface errors in MCP tools (#371, #339, #338)
  • Expand ~ in split command directory argument (#361)
  • Ignore wait_for_previous argument to support Gemini MCP clients (#322)
  • Close KnowledgeGraph SQLite connections in test fixtures (#450)
  • Remove duplicate cache variable declarations in mcp_server.py (#449)
  • Add --yes flag to init instructions for non-interactive use (#682, #534)
  • Add mcp command with setup guidance (#315)

New Features

  • i18n support — 8 languages (en, es, fr, de, ja, ko, zh-CN, zh-TW) (#718)
  • New MCP tools: get/list/update drawer, hook settings, export (#667, #635)
  • mempalace migrate — recover palaces from different ChromaDB versions (#502)
  • Add OpenClaw/ClawHub skill (#491)
  • Backend seam for pluggable storage backends (#413)

Improvements

  • Disable broken auto-bump workflow (#414)
  • Improve agent readiness — AGENTS.md, dependabot, CODEOWNERS, labels (#497)

Documentation

  • Add CLAUDE.md and mission/principles to AGENTS.md (#720)
  • Add VitePress documentation site (#439)
  • Add warning about fake MemPalace websites (#598)
  • Fix stale org URLs and PR branch target in contributor docs (#679)
  • Fix misaligned architecture diagram (#734, #733)
  • Add ROADMAP.md — v3.1.1 stability patch and v4.0.0-alpha plan

Internal

  • ruff format convo_miner.py (#741)
  • ruff format all Python files (#675)
  • CI: trigger tests on develop branch PRs and pushes (#674)
  • CI: fix GitHub Pages publishing (#691)

3.1.0 — 2026-04-09

Security

  • Harden inputs, fix shell injection, optimize DB access (#387)
  • Sanitize SESSION_ID in save hook to prevent path traversal (#141)
  • Sanitize error responses and remove sys.exit from library code (#139)
  • Shell injection fix in hooks, Claude Code mining, chromadb pin (#114)
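
As an illustration of sanitizing SESSION_ID before it reaches a file path, here is a minimal allow-list sketch. The function name, the allowed character set, and the fallback value are assumptions; the actual hook may apply different rules.

```python
import re

# Anything outside this allow-list is dropped (assumed character set).
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_session_id(raw: str, fallback: str = "unknown") -> str:
    """Strip path separators, dots, and other unsafe characters."""
    cleaned = _UNSAFE_CHARS.sub("", raw)
    return cleaned or fallback
```

Removing `.` and `/` outright means a value such as `../../etc/passwd` can no longer escape the intended directory when joined into a path.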

Bug Fixes

  • MCP null args hang, repair infinite recursion, OOM on large files (#399)
  • Release ChromaDB handles before rmtree on Windows (#392)
  • Use os.utime in mtime test for Windows compatibility (#392)
  • Negotiate MCP protocol version instead of hardcoding (#324)
  • Use upsert and deterministic IDs to prevent data stagnation (#140)
  • Make drawer_id deterministic for idempotent writes (#387)
  • Honest AAAK stats — word-based token estimator, lossy labels (#147)
  • Room detection checks keywords against folder paths (#145)
  • Use actual detected room in mine summary stats (#165)
  • Honour --palace flag in mcp_server (#264)
  • Preserve default KG path when --palace not passed (#270)
  • --yes flag skips all interactive prompts in init (#123)
  • Repair command, split args, Claude export, room keywords (#119)
  • Replace Unicode separator in convo_miner.py for Windows compatibility (#129)
  • Coerce MCP integer arguments to native Python int (#84)
  • Batch ChromaDB reads to avoid SQLite variable limit (#66)
  • Respect nested .gitignore rules during mining (#78)
  • Narrow bare except Exception to specific types where safe (#54)
  • Mark MD5 as non-security in miner drawer ID generation (#53)
  • Remove dead code and duplicate set items in entity_registry.py (#42)
  • Silence ChromaDB telemetry warnings and CoreML segfault on Apple Silicon (#236)
  • Unify package and MCP version reporting (#16)
  • Fix broken AAAK Dialect link in README (#238)
  • Update input prompt for entity confirmation (#83)
  • Preserve CLI exit codes, log tracebacks, sanitize search errors (#139)
  • Enable SQLite WAL mode and add consistent LIMIT to KG timeline (#136)
  • Add limit=10000 safety cap to all unbounded ChromaDB .get() calls (#137)
  • Re-mine modified files, idempotent add_drawer, cleanup ChromaDB handles (#140)
  • Resolve formatting, regression logic, and pytest defaults (#270)
  • Use parse_known_args to allow importing mcp_server during pytest (#270)
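
Batching reads to stay under SQLite's bound-variable limit, as in the (#66) entry above, can be sketched generically. The batch size and the `fetch` callable are hypothetical stand-ins for whatever storage call is being paginated.

```python
from typing import Callable, Iterator, List, Sequence

# Hypothetical batch size, kept well under the 999-variable default
# of older SQLite builds.
BATCH_SIZE = 500


def batched(items: Sequence[str], size: int = BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_all(
    ids: Sequence[str],
    fetch: Callable[[Sequence[str]], List[dict]],
    size: int = BATCH_SIZE,
) -> List[dict]:
    """Call `fetch` once per batch and concatenate the rows."""
    rows: List[dict] = []
    for chunk in batched(ids, size):
        rows.extend(fetch(chunk))
    return rows
```

Each call then binds at most `size` IDs, so the total number of drawers no longer determines whether the query succeeds.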

New Features

  • Package MemPalace as standard Claude and Codex plugins (#270)
  • Add OpenAI Codex CLI JSONL normalizer (#61)
  • Add Codex plugin support with hooks, commands, and documentation (#270)
  • Add command documentation for help, init, mine, search, and status (#270)

Improvements

  • Cache ChromaDB PersistentClient instead of re-instantiating per call (#135)
  • Tighten chromadb version range and add py.typed marker (#142)
  • Consolidate split known-names config loading (#22)
  • CI: add separate jobs for Windows and macOS testing
  • CI: Upgrade GitHub Actions for Node 24 compatibility (#55)

Documentation

  • Add Gemini CLI setup guide and integration section (#106)
  • Add beginner-friendly hooks tutorial (#103)
  • Align MCP setup examples with shipped server (#21)
  • Honest README update — own the mistakes, fix the claims

Internal

  • Expand test coverage from 20 to 92 tests, migrate to uv (#131)
  • Add scale benchmark suite — 106 tests (#223)
  • Increase test coverage from 30% to 85%, fix Windows encoding bugs (#281)
  • Add WAL mode and entity timeline limit assertions
  • Add coverage for file_already_mined mtime check

3.0.0 — 2026-04-06

Initial public release.

  • Palace architecture with day-based rooms, drawers (verbatim), and closets (searchable index)
  • AAAK compression dialect for memory folding
  • Knowledge graph with entity detection and timeline queries
  • MCP server for Claude, Codex, and Gemini integration
  • CLI: init, mine, search, status, compress, repair, split
  • Benchmark suite with recall and scale tests
  • README with MCP flow, local model flow, and specialist agent documentation