# Fork change manifest — canonical source for the fork-ahead narrative.
#
# Render targets (regenerated from this file by scripts/render-docs.py):
# - FORK_CHANGELOG.md (today)
# - README.md fork-change-queue table (planned)
# - CLAUDE.md row inventory (planned)
# - ~/.claude/projects/-home-jp-Projects-memorypalace/scratch/promises.md (planned)
#
# Schema (per entry):
# id short slug, used as anchor (kebab-case)
# date YYYY-MM-DD (entry date, not commit date)
# bucket Added | Changed | Fixed | Performance
# commit 7-char short SHA in this fork
# area Reliability | Search | Performance | CLI | Docs | Testing
# summary terse one-line description (<100 chars)
# body Markdown paragraph(s) for the changelog
# tests optional — count + class names
# pr optional — upstream PR # if filed
# pr_state optional — OPEN | MERGED | CLOSED at last check
# files optional list — repo-relative paths the change touched
# supersedes optional list of `id`s this entry supersedes
#
# Order in this file is presentation order in the changelog (newest first).
entries:
- id: chunking-defaults-no-materialize
date: 2026-05-03
bucket: Fixed
commit: 6ce37c0
area: CLI
summary: "`cfg.init()` no longer materializes chunking defaults into `config.json`"
body: |
`cfg.init()` was unconditionally writing ``chunk_size: 800``,
``chunk_overlap: 100``, and ``min_chunk_size: 50`` into
``config.json`` on first run. The values match ``miner.py``'s
module-level constants but conflict with ``convo_miner.py``'s
lower ``MIN_CHUNK_SIZE = 30`` floor. ``convo_miner.py``
lines 427-431 explicitly distinguish "user has tuned this"
from "user is on defaults" by checking
``_file_config.get("min_chunk_size") is None``. Materializing
the value as a default broke that detection: any user who ran
``mempalace init`` then mined conversations would silently lose
exchanges shorter than 50 characters, even though the convo
miner's intended floor is 30.
Surfaced by a pytest fixture leak. ``tests/conftest.py:21-27``
redirects ``HOME`` to a session-tmp directory so tests don't
trash the real ``~/.mempalace``. The first test that calls
``cmd_init`` writes the bloated default config into the
session-tmp ``~/.mempalace``, and the downstream
``test_convo_miner`` runs (in-process, same session) then read
``min_chunk_size: 50`` and skip the test fixture's ~30-char
exchanges entirely. Both tests pass in isolation; the second
fails when chained.
Fix: drop the three chunking keys from ``cfg.init()``'s
default-config-write. The
``MempalaceConfig.chunk_size``/``.chunk_overlap``/``.min_chunk_size``
properties already provide the right fallbacks via
``_file_config.get(key, default)`` when the key is absent.
Users who want to tune chunking still set the keys explicitly;
the contract ``convo_miner.py`` relies on (``is None`` ⇔
"untuned") is restored.
Same fix pushed to the open #1024 PR branch as commit
``df9187c`` so the bug doesn't get reintroduced when #1024
merges. Amends fork-ahead row 17.
tests: "1548/1548 (was 1546/1548 with 2 isolation failures in test_convo_miner)"
pr: 1024
pr_state: OPEN
files:
- mempalace/config.py
supersedes: []
- id: kind-filter-retired
date: 2026-04-27
bucket: Changed
commit: 7ba28dc
area: Search
summary: "Retire the `kind=` filter — structural split made it inert"
body: |
Phases A–E of the checkpoint collection split (2026-04-25 → 2026-04-26)
moved every Stop-hook auto-save checkpoint drawer to the dedicated
``mempalace_session_recovery`` collection. Empirical check on the
canonical 151K palace: ``mempalace_drawers`` has zero
``topic=checkpoint`` and zero ``topic=auto-save`` drawers; recovery
collection holds 763. The ``kind=`` post-filter was filtering nothing.
Deleted: ``_CHECKPOINT_TOPICS`` (moved to ``palace.py`` for write-side
routing), ``_is_checkpoint_drawer``, ``_apply_kind_text_filter``, the
``max(n*20, 100)`` over-fetch hack (back to standard ``n_results * 3``),
the ``kind=`` parameter on ``search_memories`` / ``build_where_filter`` /
CLI ``search`` / ``mempalace_search`` MCP tool input_schema, and
``TestCheckpointFilter`` (9 tests). Companion fix in
[palace-daemon](https://github.com/jphein/palace-daemon/commit/4a318d3)
(v1.7.1) drops ``kind=`` from ``/search`` and ``/context`` HTTP routes.
tests: "−9 (TestCheckpointFilter deleted; suite at 1500)"
files:
- mempalace/searcher.py
- mempalace/mcp_server.py
- mempalace/palace.py
- mempalace/migrate.py
- mempalace/layers.py
- tests/test_searcher.py
- id: closet-boost-ablation
date: 2026-04-27
bucket: Changed
commit: 3cb03f3
area: Search
summary: "Hoist CLOSET_RANK_BOOSTS to module level + record VecRecall ablation finding"
body: |
Two-step refactor + measurement. First (commit ``f558d3c``):
hoist ``CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]`` and
``CLOSET_DISTANCE_CAP`` from inside ``search_memories`` to module
scope so they can be tuned from the outside (env var, config flag,
or in-process patch for A/B benchmarking) without touching the
function. No behavior change; pure ablation enablement.
Then (commit ``3cb03f3``): A/B ablation against the 151K canonical
palace (12-probe set covering recent fork-side decisions + mined-file
content). Closet boost fires on ~20% of result rows, concentrated
in queries whose answer lives in mined files; closets are sparse on
chat-transcript queries (most fork-side decisions). When the boost
fired, it re-ordered chunks within a single source file rather than
displacing right answers with wrong ones — i.e. VecRecall's critique
([discussions/1129](https://github.com/MemPalace/mempalace/discussions/1129),
"org-layer in retrieval path drops R@5") did not reproduce here.
Hybrid degrades to effectively pure-vector for transcript queries
and re-ranks within-file chunks for mined-file queries; neither
shape matches the failure mode VecRecall is fixing. Findings noted
in the comment block above the constants so future-us doesn't have
to re-run the experiment.
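Why the hoist enables ablation, in a sketch (``boosted_score`` is a stand-in for the scoring step inside ``search_memories``; only the constant name and values come from this entry): a module-level constant can be reassigned by the A/B harness without touching the function.

```python
# Sketch: a module-level boost table can be patched in-process for A/B runs.
# boosted_score is a stand-in for the real scoring step.
CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]

def boosted_score(base, closet_rank):
    # Reads the module-level table on every call, so patching it takes effect.
    if closet_rank < len(CLOSET_RANK_BOOSTS):
        return base + CLOSET_RANK_BOOSTS[closet_rank]
    return base

with_boost = boosted_score(1.0, 0)   # boost fires
CLOSET_RANK_BOOSTS = [0.0] * 5       # ablation: zero the table in-process
without_boost = boosted_score(1.0, 0)
```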
files:
- mempalace/searcher.py
- id: plugin-manifest-scrub
date: 2026-04-27
bucket: Fixed
commit: 9f91e18
area: Reliability
summary: "Strip embedded API key from .claude-plugin/ manifests; rely on env inheritance"
body: |
``.claude-plugin/.mcp.json`` and ``.claude-plugin/hooks/hooks.json``
shipped with a real (rotated) API key embedded as a literal in the
manifest's ``env`` block, plus my homelab daemon URL. Both are
committed plugin templates that get pulled into every plugin install.
Fix in two commits: ``8119149`` reverted both manifests to the
upstream-shape (no env block, in-process MCP), then ``9f91e18``
restored daemon-routing on ``.mcp.json`` (URL + path) but **without**
the embedded credential — ``PALACE_API_KEY`` now inherits at runtime
from ``~/.claude/settings.local.json``'s ``env`` block (which
Claude Code passes to spawned MCP servers and hooks).
Net: my fork-main carries the daemon-routed config matching production
deployment; the literal credential lives one place only (gitignored
``settings.local.json``); future plugin installs inherit env rather
than carrying a stale embedded key. Companion to palace-daemon
[PR #12](https://github.com/rboarescu/palace-daemon/pull/12) which
fixes the same class of embedded-default in ``clients/palace-mode``.
files:
- .claude-plugin/.mcp.json
- .claude-plugin/hooks/hooks.json
- id: cherry-pick-1094
date: 2026-04-26
bucket: Changed
commit: 43d728d
area: Reliability
summary: "Cherry-pick #1094 — coerce None metadatas at chromadb boundary"
body: |
Fork main was carrying the per-site ``meta = meta or {}`` guards
from #999 in eight read paths but didn't have the boundary
coercion that closes the issue once for all callers. The typed
``QueryResult``/``GetResult`` contract declares
``metadatas: list[dict]``, never ``list[Optional[dict]]`` — so
every call site that forgot the per-site guard was a latent
``AttributeError``. #1094 (open upstream, jp-authored) coerces
at ``ChromaCollection.query()`` / ``.get()`` so downstream
callers always receive ``list[dict]``. Per-site guards retained
as belt-and-suspenders for paths that might bypass the typed
wrappers. Three same-family fork-ahead PRs (#1198, #1201, #1083
review) all pointed at gaps that would have been impossible if
this pattern had been in place.
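The boundary-coercion shape can be sketched as follows (the real fix lives in ``mempalace/backends/chroma.py`` and also handles ``query()``'s nested inner lists; this flat-list helper is an assumed simplification):

```python
# Sketch of coercing None metadatas at the backend boundary so callers
# always receive list[dict]. Flat-list (get-style) case only.
def coerce_metadatas(result):
    metas = result.get("metadatas")
    if metas is None:
        # Field omitted entirely: pad to match ids.
        result["metadatas"] = [{} for _ in result.get("ids", [])]
        return result
    result["metadatas"] = [m if m is not None else {} for m in metas]
    return result

raw = {"ids": ["a", "b", "c"], "metadatas": [{"room": "x"}, None, None]}
fixed = coerce_metadatas(raw)
```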
tests: "6 in test_backends.py (mixed/all-None inner lists, padding regression, get-without-metadatas)"
pr: 1094
pr_state: OPEN
files:
- mempalace/backends/chroma.py
- tests/test_backends.py
- id: cherry-pick-1087-rewrite
date: 2026-04-26
bucket: Changed
commit: 366a9ad
area: CLI
summary: "Cherry-pick #1087 rewrite — collection.delete(where=) instead of nuke-and-rebuild"
body: |
Fork main had been carrying ``cmd_purge``'s nuke-and-rebuild
shape (extract survivors, ``shutil.rmtree``, recreate, re-insert).
Cherry-picked the post-review rewrite from PR #1087's branch:
``ChromaBackend.get_collection`` + ``col.delete(where=...)``.
The race in #521 is on the upsert path
(``updatePoint`` / ``repairConnectionsForUpdate``) — filter-delete
doesn't reach it. Five fixes from @igorls's review now apply to
our own purge: embedding function preserved, no rmtree window,
routes through the backend, ``confirm_destructive_action`` reused,
end-to-end test covers the embedding-fn-survival path.
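The difference between the two shapes, sketched with an in-memory stand-in for a chromadb collection (the real call is ``col.delete(where={...})``; ``CollectionSketch`` is hypothetical): filter-delete mutates rows in place, so the collection object and its embedding function never go away.

```python
# In-memory stand-in: filter-delete keeps the collection (and its
# embedding function) alive, unlike rmtree-and-recreate.
class CollectionSketch:
    def __init__(self, rows, embedding_fn="ef-v1"):
        self.rows = rows                  # id -> metadata
        self.embedding_fn = embedding_fn  # survives: collection never dropped

    def delete(self, where):
        key, value = next(iter(where.items()))
        self.rows = {i: m for i, m in self.rows.items() if m.get(key) != value}

col = CollectionSketch({"d1": {"wing": "work"}, "d2": {"wing": "play"}})
col.delete(where={"wing": "play"})
```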
tests: "5 in test_cli.py (TestCmdPurge + e2e)"
pr: 1087
pr_state: OPEN
files:
- mempalace/cli.py
- tests/test_cli.py
- id: doc-canonical-source
date: 2026-04-26
bucket: Added
commit: 5a01aec
area: Docs
summary: "Canonical YAML manifest + renderer for fork-ahead docs"
body: |
The fork-ahead narrative previously lived (and drifted) across four
hand-edited files: README's fork-change-queue table, CLAUDE.md's row
inventory, FORK_CHANGELOG.md, and the promises tracker. New
``docs/fork-changes.yaml`` is now the canonical source; running
``scripts/render-docs.py`` regenerates FORK_CHANGELOG.md.
``scripts/check-docs.sh`` extended with a render-parity check that
detects YAML→FORK_CHANGELOG drift, plus the existing test-count /
commit-hash / upstream-PR-state checks. Researched towncrier, scriv,
git-cliff, antsibull-changelog — none do single-source →
multi-target render in this shape. README/CLAUDE/promises
rendering planned for follow-on commits with marker-based
insertion.
files:
- docs/fork-changes.yaml
- scripts/render-docs.py
- scripts/check-docs.sh
- FORK_CHANGELOG.md
- CLAUDE.md
- id: phase-d-precompact
date: 2026-04-26
bucket: Added
commit: 42817d7
area: Reliability
summary: "Phase D migration + PreCompact recovery write"
body: |
``migrate_checkpoints_to_recovery(palace_path, batch_size=1000)`` walks
the main collection in pages, filters drawers with topic in
``_CHECKPOINT_TOPICS`` in Python (avoids the chromadb 1.5.x ``$in``/``$nin``
filter-planner bug), copies them to the recovery collection
(preserving IDs + metadata), then deletes from main. Idempotent —
re-running on a fully-reorganized palace returns 0. Add-then-delete
order: a crash mid-migration leaves a duplicate, not a loss.
Wired into ``mempalace repair --mode reorganize`` for explicit operator
runs. PreCompact incorporated — ``hook_precompact`` now writes a
session-recovery marker mirroring Stop, so context-compaction events
leave a queryable timestamp in the recovery collection rather than
nothing. Failures are non-fatal (logged; mining + compaction still
proceed).
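The migration loop can be sketched with dict stand-ins for the two collections (the real code pages via the backend and copies IDs plus metadata; ``migrate_checkpoints`` here is a reduced shape, not the actual function body):

```python
# Sketch: paged walk, Python-side topic filter (avoids the chromadb 1.5.x
# $in/$nin planner bug), add-then-delete so a crash duplicates, never loses.
CHECKPOINT_TOPICS = {"checkpoint", "auto-save"}

def migrate_checkpoints(main, recovery, batch_size=1000):
    moved = 0
    ids = list(main)  # snapshot so we can mutate while walking pages
    for start in range(0, len(ids), batch_size):
        page = ids[start:start + batch_size]
        hits = [i for i in page if main[i].get("topic") in CHECKPOINT_TOPICS]
        for i in hits:
            recovery[i] = main[i]   # add first...
            del main[i]             # ...delete second
        moved += len(hits)
    return moved

main = {"a": {"topic": "checkpoint"},
        "b": {"topic": "notes"},
        "c": {"topic": "auto-save"}}
recovery = {}
```

Running it twice shows the idempotence property: the second pass finds nothing left to move and returns 0.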
tests: "6 in TestMigrateCheckpointsToRecovery + 1 in test_hooks_cli"
files:
- mempalace/migrate.py
- mempalace/cli.py
- mempalace/hooks_cli.py
- tests/test_migrate.py
- id: drawer-id-surfacing
date: 2026-04-26
bucket: Added
commit: 9a8bb77
area: Search
summary: "Surface drawer_id in search/diary/recovery payloads"
body: |
ChromaDB's primary key was always returned by ``query()`` and ``get()``
but never plumbed into result-building loops; consumers (e.g.
familiar.realm.watch's citation-popover loop) couldn't link a hit
back to the underlying drawer. Three call sites updated for parity:
``searcher.search_memories`` (vector path + sqlite BM25 fallback),
``mcp_server.tool_session_recovery_read``, ``mcp_server.tool_diary_read``.
Defensive zip with id-pad: production chromadb always returns ids,
but several test mocks omit them — pad with ``None`` when absent so
existing fixtures keep working without touching N tests.
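The id-pad can be sketched like this (``rows_with_ids`` is a stand-in for the result-building loops; real chromadb always returns ``ids``, so the pad only matters for mocks):

```python
# Sketch of the defensive id-pad: pad missing ids with None so the zip
# stays aligned with documents instead of truncating to zero rows.
def rows_with_ids(result):
    docs = result.get("documents") or []
    ids = result.get("ids") or [None] * len(docs)   # pad when absent
    return [{"drawer_id": i, "document": d} for i, d in zip(ids, docs)]

mocked = {"documents": ["alpha", "beta"]}   # no ids key, like old fixtures
real = {"ids": ["d1"], "documents": ["gamma"]}
```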
tests: "1 integration + 1 inline assertion"
files:
- mempalace/searcher.py
- mempalace/mcp_server.py
- website/reference/mcp-tools.md
- id: hnsw-integrity-gate
date: 2026-04-26
bucket: Fixed
commit: 645ba20
area: Reliability
summary: "Integrity gate prevents quarantine_stale_hnsw from destroying healthy indexes"
body: |
Previous behavior fired whenever ``sqlite_mtime - hnsw_mtime`` exceeded
the (lowered, in #1173) 300s threshold. ChromaDB 1.5.x flushes HNSW
asynchronously and a clean shutdown does not force-flush, so the
on-disk HNSW is always meaningfully older than ``chroma.sqlite3`` —
that's the steady state, not corruption. Quarantine renamed valid
HNSW segments on every cold-start; chromadb created empty replacements;
vector recall went to 0/N until rebuild. Confirmed in production on
the disks daemon journal 2026-04-26 06:56:45: three of three healthy
253MB segments quarantined on cold-start with 538-557s gaps. Fix:
stage 2 integrity gate sniffs the chromadb segment metadata file
for its protocol/terminator bytes (PROTO ``\x80`` head, STOP ``\x2e``
tail) and a non-trivial size, **without deserializing**. Healthy
segment with mtime drift → keep in place; truncated/zero-filled →
quarantine.
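The sniff can be sketched as follows (``looks_healthy`` and its ``min_size`` threshold are assumptions; only the head/tail byte values come from this entry):

```python
# Sketch of the stage-2 integrity sniff: check head/tail bytes and a
# non-trivial size without deserializing anything.
import os
import tempfile

def looks_healthy(path, min_size=8):
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError:
        return False
    return (len(data) >= min_size
            and data[:1] == b"\x80"     # PROTO opcode at head
            and data[-1:] == b"\x2e")   # STOP opcode ('.') at tail

d = tempfile.mkdtemp()
healthy = os.path.join(d, "metadata.bin")
with open(healthy, "wb") as f:
    f.write(b"\x80" + b"\x00" * 30 + b".")   # well-formed shape, any mtime
truncated = os.path.join(d, "trunc.bin")
with open(truncated, "wb") as f:
    f.write(b"\x80\x00")                     # cut off before the STOP byte
```

A healthy segment with pure mtime drift passes; a truncated or zero-filled one fails and gets quarantined.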
tests: "4 in test_backends.py (renames-corrupt, leaves-healthy-with-drift, leaves-no-metadata, renames-truncated)"
pr: 1173
pr_state: MERGED
merged_date: 2026-04-26
files:
- mempalace/backends/chroma.py
- tests/test_backends.py
- id: hnsw-cold-start-gate
date: 2026-04-25
bucket: Fixed
commit: 70c4bc6
area: Reliability
summary: "Gate quarantine_stale_hnsw to once-per-palace-per-process"
body: |
``make_client()`` previously invoked ``quarantine_stale_hnsw`` on every
reconnect; under steady write load the proactive check kept firing,
racking up ``.drift-*`` directories every 10–30 minutes. New
``ChromaBackend._quarantined_paths: set[str]`` caps it to one fire on
first open per palace per process. Real cold-start drift still caught
(replicated/restored palace); real runtime errors still caught via
palace-daemon's ``_auto_repair``, which calls ``quarantine_stale_hnsw``
directly and bypasses this gate.
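The gate reduces to a process-wide set, sketched here (``BackendSketch`` and ``maybe_quarantine`` are stand-in names; the real set lives on ``ChromaBackend``):

```python
# Sketch of the once-per-palace-per-process gate: a class-level set shared
# across instances, so reconnects within one process never re-fire.
class BackendSketch:
    _quarantined_paths: set = set()

    def __init__(self):
        self.fired = 0

    def maybe_quarantine(self, palace_path):
        if palace_path in self._quarantined_paths:
            return False                  # already checked this palace
        self._quarantined_paths.add(palace_path)
        self.fired += 1                   # stand-in for quarantine_stale_hnsw
        return True

b = BackendSketch()
```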
tests: "2 in test_backends.py (single-fire-per-palace, per-palace independence)"
pr: 1173
pr_state: MERGED
merged_date: 2026-04-26
files:
- mempalace/backends/chroma.py
- tests/test_backends.py
- tests/conftest.py
- id: cherry-pick-1085
date: 2026-04-26
bucket: Performance
commit: 6be6fff
area: Performance
summary: "Cherry-pick #1085 — batch ChromaDB inserts in miner (10–30× faster)"
body: |
Cherry-picked from upstream PR
[#1085](https://github.com/MemPalace/mempalace/pull/1085) (@midweste,
OPEN as of 2026-04-26). New ``_build_drawer()`` helper + ``add_drawers()``
batch-insert path; ``process_file`` hands the full chunk list to
``add_drawers`` instead of looping per-chunk. Hoists ``datetime.now()``
and ``os.path.getmtime()`` to file-level (2 syscalls per file instead
of 2N). Reported 10–30× mining speedup upstream. Fork-side resolution
preserved fork's existing ``DRAWER_UPSERT_BATCH_SIZE=1000``; aliased
upstream's ``CHROMA_BATCH_LIMIT`` to it. Becomes a no-op when #1085
merges to develop and we next sync.
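The batch shape can be sketched as follows (``process_file_sketch`` is a stand-in; the hoisted-timestamp and one-insert-per-batch structure mirrors the entry, the rest is assumed):

```python
# Sketch: timestamp once per file (not per chunk), build all drawers,
# insert in fixed-size batches instead of one call per chunk.
import time

def process_file_sketch(chunks, add_drawers, batch_size=1000):
    now = time.time()   # hoisted: one call per file, not per chunk
    drawers = [{"text": c, "ingested_at": now} for c in chunks]
    for start in range(0, len(drawers), batch_size):
        add_drawers(drawers[start:start + batch_size])  # one insert per batch

calls = []
process_file_sketch([f"chunk-{i}" for i in range(2500)], calls.append)
```

2,500 chunks become three insert calls instead of 2,500, and every drawer in the file shares one timestamp.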
pr: 1085
pr_state: OPEN
files:
- mempalace/miner.py
- id: deploy-script
date: 2026-04-26
bucket: Added
commit: 8252025
area: CLI
summary: "scripts/deploy.sh — one-command Syncthing-aware redeploy"
body: |
Single command does the right shape: push fork main → wait for
Syncthing to reach ``/mnt/raid/projects/memorypalace`` on the deploy
host → ``systemctl --user restart palace-daemon`` → poll ``/health`` →
ssh-import-check that today's fork-ahead surface is loaded.
Replaces a three-step manual ritual that was easy to get wrong
(e.g. ``pip install --upgrade`` was a no-op on the editable install).
files:
- scripts/deploy.sh
- id: phase-a-c-checkpoint-split
date: 2026-04-25
bucket: Added
commit: e266365
area: Search
summary: "Phases A–C of the checkpoint collection split"
body: |
New ``mempalace_session_recovery`` collection adapter
(``_SESSION_RECOVERY_COLLECTION`` + ``get_session_recovery_collection``
in ``palace.py``); ``tool_diary_write`` routes ``topic in _CHECKPOINT_TOPICS``
to it. New ``mempalace_session_recovery_read`` MCP tool reads recovery
collection only with optional filters (session_id, agent, since,
until, wing, limit). Promoted from "future work" to "necessary" by
the same-day Cat 9 A/B (``kind=all`` 632 tokens/Q vs ``kind=content``
3 tokens/Q on the canonical 151K-drawer palace). Design doc at
``docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md``.
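The write-side routing reduces to one membership test, sketched with dict stand-ins for the two collections (``diary_write_sketch`` is a hypothetical reduction of ``tool_diary_write``):

```python
# Sketch: checkpoint topics route to the recovery collection, everything
# else to main. Collections are dict stand-ins.
_CHECKPOINT_TOPICS = {"checkpoint", "auto-save"}

def diary_write_sketch(main, recovery, drawer_id, topic, text):
    target = recovery if topic in _CHECKPOINT_TOPICS else main
    target[drawer_id] = {"topic": topic, "text": text}

main, recovery = {}, {}
diary_write_sketch(main, recovery, "d1", "checkpoint", "ckpt body")
diary_write_sketch(main, recovery, "d2", "design-note", "note body")
```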
tests: "12 across test_session_recovery.py + TestCheckpointRouting + TestSessionRecoveryRead"
files:
- mempalace/palace.py
- mempalace/mcp_server.py
- tests/test_session_recovery.py
- tests/test_mcp_server.py
- website/reference/mcp-tools.md
- id: palace-graph-none-guard
date: 2026-04-25
bucket: Fixed
commit: 5fd15db
area: Reliability
summary: "palace_graph.build_graph skips None metadata"
body: |
``palace_graph.py:95`` was calling ``meta.get("room", "")`` unconditionally;
ChromaDB returns ``None`` for legacy/partial-write drawers, taking out
every consumer of ``build_graph`` (graph_stats, find_tunnels, traverse,
the daemon's ``/stats``). Caught by palace-daemon's ``verify-routes.sh``
smoke test. Same family as upstream's #999 None-metadata audit, in a
read path the audit didn't reach.
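The guard itself is small; a sketch of the loop shape (``rooms_from_metadatas`` is a stand-in for the iteration inside ``build_graph``):

```python
# Sketch: skip drawers whose metadata came back as None instead of
# calling .get on them.
def rooms_from_metadatas(metadatas):
    rooms = []
    for meta in metadatas:
        if meta is None:        # legacy/partial-write drawer
            continue
        rooms.append(meta.get("room", ""))
    return rooms
```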
pr: 1201
pr_state: MERGED
merged_date: 2026-04-26
files:
- mempalace/palace_graph.py
- id: kind-filter
date: 2026-04-25
bucket: Fixed
commit: f9f5cc4
area: Search
summary: "kind= filter on search_memories excludes Stop-hook checkpoints (transitional)"
body: |
Three values: ``"content"`` (default, excludes), ``"checkpoint"``
(recovery/audit only), ``"all"`` (no filter). Two same-day architecture
corrections: (a) the where-clause filter (``topic $nin [...]``) tripped
a chromadb 1.5.x filter-planner bug; the exclusion moved to post-filter
only ([398f42f](https://github.com/jphein/mempalace/commit/398f42f));
(b) vector top-N is dominated by checkpoints on this palace, so
post-filter alone empties the result set without aggressive over-fetch
— pull size raised to ``max(n*20, 100)`` for ``kind != "all"`` (this commit).
Safety net during the transition; once Phase D ships and existing
checkpoints migrate, the post-filter and over-fetch hack become
deletable.
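The transitional mechanics can be sketched like this (``search_sketch`` and ``vector_topn`` are stand-ins; the ``max(n*20, 100)`` pull size and the three ``kind`` values come from this entry):

```python
# Sketch of post-filter + over-fetch: pull extra rows so excluding
# checkpoints still fills n_results on a checkpoint-dominated palace.
_CHECKPOINT_TOPICS = {"checkpoint", "auto-save"}

def search_sketch(vector_topn, n_results, kind="content"):
    pull = n_results if kind == "all" else max(n_results * 20, 100)
    rows = vector_topn(pull)                       # over-fetch
    if kind == "content":
        rows = [r for r in rows if r["topic"] not in _CHECKPOINT_TOPICS]
    elif kind == "checkpoint":
        rows = [r for r in rows if r["topic"] in _CHECKPOINT_TOPICS]
    return rows[:n_results]

# Top-N dominated by checkpoints, like the canonical palace:
corpus = [{"topic": "checkpoint"}] * 90 + [{"topic": "notes"}] * 30
topn = lambda k: corpus[:k]
```

Without the over-fetch, a pull of 5 would return only checkpoints and the post-filter would empty the result set.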
tests: "9 in TestCheckpointFilter"
supersedes: []
files:
- mempalace/searcher.py
- mempalace/mcp_server.py
- tests/test_searcher.py
# Sections rendered to the bottom of FORK_CHANGELOG.md
merged_upstream:
trim_after_days: 30
notes:
- "Trim entries from this list once they're more than ~30 days old."
- "See CHANGELOG.md (upstream) for the full released history."
entries:
- { pr: 1173, title: "quarantine_stale_hnsw on make_client + cold-start gate + integrity sniff", merged: 2026-04-26 }
- { pr: 1177, title: "`.blob_seq_ids_migrated` marker guard (closes #1090)", merged: 2026-04-26 }
- { pr: 1198, title: "_tokenize None-document guard in BM25 reranker", merged: 2026-04-26 }
- { pr: 1201, title: "palace_graph.build_graph skips None metadata", merged: 2026-04-26 }
- { pr: 659, title: "diary `wing` parameter", merged: 2026-04-23 }
- { pr: 661, title: "graph cache with write-invalidation", merged: 2026-04-22 }
- { pr: 673, title: "deterministic hook saves", merged: 2026-04-22 }
- { pr: 1021, title: "Claude Code 2.1.114 stdout/silent_save fixes", merged: 2026-04-22 }
- { pr: 999, title: "None-metadata guards across read paths", merged: 2026-04-18 }
- { pr: 1000, title: "quarantine_stale_hnsw shipped", released_in: "v3.3.2" }
- { pr: 1023, title: "PID file guard prevents stacking mine processes", released_in: "v3.3.2" }
- { pr: 681, title: "Unicode checkmark → ASCII", released_in: "v3.3.2" }