perf: graph cache with write-invalidation in build_graph() #661
bensig merged 3 commits into MemPalace:develop from
Conversation
Pull request overview
This PR introduces a short-lived (60s TTL) module-level cache for build_graph() to avoid repeated full metadata scans when multiple MCP tools build the palace graph in quick succession.
Changes:
- Added module-level graph cache state and an `invalidate_graph_cache()` helper in `mempalace/palace_graph.py`.
- Updated `build_graph()` to return cached results within the TTL and to store results after a scan.
- Added cache-focused tests and per-test cache invalidation to prevent cross-test interference.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `mempalace/palace_graph.py` | Adds TTL-based caching for graph builds and an explicit invalidation function. |
| `tests/test_palace_graph.py` | Adds cache invalidation in test setup and new tests asserting cache/invalidation behavior. |
```python
# Module-level graph cache — mirrors _metadata_cache pattern in mcp_server.py
_graph_cache_nodes = None
_graph_cache_edges = None
_graph_cache_time = 0.0
_GRAPH_CACHE_TTL = 60.0  # seconds — graph changes less often than metadata


def invalidate_graph_cache():
    """Clear the graph cache. Called from mcp_server.py on writes."""
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
```
Fixed — removed the stale reference to _metadata_cache. Comment is now self-contained: module-level cache with 60s TTL and write-invalidation. 277e476
```python
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
    now = time.time()
    if _graph_cache_nodes is not None and (now - _graph_cache_time) < _GRAPH_CACHE_TTL:
        return _graph_cache_nodes, _graph_cache_edges
```
Documented — added comment explaining this is intentional for the MCP server's single-palace use case. Callers switching collections should call invalidate_graph_cache() first. 277e476
```python
    _graph_cache_nodes = nodes
    _graph_cache_edges = edges
    _graph_cache_time = time.time()
```
Fixed — empty graphs are no longer cached (if nodes: guard). New data is picked up immediately when palace is first populated. 277e476
```python
# Second call with a *different* collection — should still return cached result
col2 = _make_fake_collection([])
nodes2, edges2 = build_graph(col=col2)
assert nodes1 == nodes2
assert edges1 == edges2
```
Documented — expanded docstring to explain this verifies intentional single-palace cache behavior. If multi-palace support is needed, cache key should be extended and this test updated. 277e476
Addresses Copilot review feedback on MemPalace#661. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bensig left a comment
Two issues found in code review:

1. **No thread safety on cache globals (score 95)** — `_graph_cache_nodes`, `_graph_cache_edges`, and `_graph_cache_time` are read and written without any lock. Concurrent requests can read stale or partially-written cache state. Add a `threading.Lock` around the read/write paths.
2. **Warm cache silently ignores the `col` argument (score 82)** — `build_graph(col=different_col)` returns the cached result for the original collection with no indication. Consider keying the cache on the collection identity, or at minimum documenting this behavior in the docstring and logging when cached args are ignored.
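A minimal sketch of the first requested fix, assuming a single module-level lock. The `_lock`, `_cache`, and `_scan` names are illustrative, not taken from the PR:

```python
import threading
import time

_lock = threading.Lock()
_cache = None          # (nodes, edges)
_cache_time = 0.0
_TTL = 60.0


def _scan(col):
    # Stand-in for the real O(n) metadata scan.
    return {"n1": {}}, []


def build_graph(col=None):
    global _cache, _cache_time
    with _lock:  # guard the staleness check and the read together
        if _cache is not None and time.time() - _cache_time < _TTL:
            return _cache
    nodes, edges = _scan(col)  # scan outside the lock: readers aren't blocked for the whole pass
    with _lock:
        _cache = (nodes, edges)
        _cache_time = time.time()
    return nodes, edges


def invalidate_graph_cache():
    global _cache, _cache_time
    with _lock:
        _cache = None
        _cache_time = 0.0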
Closing duplicate PR created by network timeout.
Force-pushed 277e476 to cf8034c.
Force-pushed cf8034c to 509120c.
@bensig Both issues from your review are already addressed in the current branch. Branch has been rebased on the latest develop.
build_graph() scans every drawer's metadata in 1000-item batches on every call — O(n) per graph build with no caching. At 50K+ drawers this costs several seconds per MCP tool call (traverse, find_tunnels, graph_stats all call build_graph on every invocation). Add a module-level cache (nodes + edges + timestamp) with a 60-second TTL. Cache is invalidated via invalidate_graph_cache(), exported for write operations to call. Tests updated with setup_method cache resets and two new tests verifying cache hit and invalidation behaviour.
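The per-call cost comes from paging through every drawer's metadata. A rough sketch of such a batched pass, with the collection API shape assumed (loosely modeled on a ChromaDB-style `get(limit=, offset=)`; `FakeCollection` and `iter_metadata` are illustrative names, not from the codebase):

```python
class FakeCollection:
    """Tiny stand-in exposing a ChromaDB-style get(limit=, offset=)."""

    def __init__(self, n):
        self._ids = [f"drawer-{i}" for i in range(n)]

    def get(self, limit, offset):
        ids = self._ids[offset:offset + limit]
        return {"ids": ids, "metadatas": [{} for _ in ids]}


def iter_metadata(col, batch_size=1000):
    # O(n) pass: every build_graph() call pays this in full without a cache.
    offset = 0
    while True:
        batch = col.get(limit=batch_size, offset=offset)
        if not batch["ids"]:
            break
        yield from zip(batch["ids"], batch["metadatas"])
        offset += batch_size
```

At 50K+ drawers that is 50+ batch fetches per tool call, which is the cost the 60-second cache amortizes across successive MCP tool invocations.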
Addresses Copilot review feedback on MemPalace#661. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback from @bensig: 1. Wrap cache reads/writes in threading.Lock for thread safety 2. Promote the col-arg caveat from inline comment to docstring Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed 79e78db to 8adf35a.
fork-direction.md and TODO-fork-improvements.md were scattered strategic thinking that belongs in the front-door README: competitive landscape, roadmap (P0-P6), and open problems. Merges them in and deletes the standalone files.

PR-status tables (README + CLAUDE.md) were stale after today's rebases of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect the current mergeable state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863.

test_readme_claims.py now skips MCP-tool-table and dialect-reference checks when absent — our slimmed fork README doesn't reproduce upstream's tool table structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ping @bensig — the changes you requested on 2026-04-12 have been addressed in the follow-up commit on 2026-04-16 (added …)
Every remaining row in "Still ahead of upstream" now carries a status so the reader can tell at a glance whether each change is being upstreamed, pending a PR, or deliberately fork-local.

Dropped:
- "Guard ChromaDB 1.5.x metadata-mismatch segfault" — this row was overstated. The memory file for today's debugging notes that the try-get/except-create pattern is defensive code that never reproduced a specific crash (the actual crashes traced to HNSW drift). Leaving it in "Still ahead" implied an upstream-candidate fix, which it isn't. The code stays in place as defensive, but the README no longer claims it as a fork-ahead feature.

Moved to Superseded:
- "Stale HNSW mtime detection + mempalace_reconnect" — upstream took a different approach in MemPalace#757. Our broader inode+mtime detection and the mempalace_reconnect MCP tool remain as fork-local conveniences; they're just not "ahead of upstream" anymore.

Statuses now populated:
- Linked PR numbers for the 7 changes with active upstream PRs (MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673 with APPROVED note, MemPalace#999, MemPalace#1000, MemPalace#1005).
- "PR pending" for 3 items that are good candidates but unfiled: epsilon mtime comparison, max_distance parameter, tool output mining.
- "fork-only" for 2 items we keep intentionally without pitching upstream: .blob_seq_ids_migrated marker (narrow), bulk_check_mined (complementary to upstream's MemPalace#784 file-locking).

A legend sentence added above the table explains the three status values. 42 README-claim tests pass.
…#1023 merged, MemPalace#673 rebased

- Bump fork-ahead section header to "after v3.3.2"
- Strike #11 (quarantine_stale_hnsw) and MemPalace#18 (PID file guard) as merged-into-upstream-via-v3.3.2; keep entries for traceability
- Add v3.3.2-shipped items to "Merged into upstream (post-v3.3.1)"
- Rebuild PR table: 10 merged / 7 open / 7 closed; add MemPalace#1024 row, reclassify MemPalace#681/MemPalace#1000/MemPalace#1023 as merged, note MemPalace#673 rebased 2026-04-21
- Annotate MemPalace#661 status with the GitHub review-state machine caveat (CHANGES_REQUESTED persists until the reviewer dismisses it, not owed)
- Bump test count 1063 → 1096 post-merge
…unts

- Upstream released v3.3.2 on 2026-04-21 (our MemPalace#681/MemPalace#1000/MemPalace#1023)
- Drawer count 152,682 → 165,632; wings 23 → 28; tests 1063 → 1096
- MemPalace#673 now MERGEABLE after fresh rebase + squash to 1 commit
- MemPalace#661 status clarified: CHANGES_REQUESTED persists until the reviewer dismisses it; not an outstanding owe
- Merged-upstream section: split out the v3.3.2 release group
bensig left a comment
threading.Lock usage is correct. Cache invalidation is sound (write-invalidation + 60s TTL fallback). Both original review items addressed. CI green across all platforms.
… (2026-04-22)

Ben's batched queue-clear pass merged four PRs at 00:38 UTC: graph cache (MemPalace#661), deterministic hook saves (MemPalace#673), Claude Code 2.1.114 hook stdout + silent_save guard (MemPalace#1021), and upstream's own MemPalace#851 pagination fix (closing MemPalace#1036 as superseded).

Moved four rows out of the "Fork Changes" / "Fork change queue" tables into their respective merged-upstream history sections. Intro-sentence PR count reduced from 7 → 4 open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore-integrity release. Unbreaks fresh `pip install mempalace` from v3.3.2 by re-tagging current develop, which carries both the plugin.json consumer (shipped in 3.3.2) and the matching mempalace-mcp entry point in pyproject.toml (added on develop ~10h after the 3.3.2 tag via MemPalace#340 by @messelink). MemPalace#1093 diagnosed by @jphein.

Bumps (all 5 sources agree per Version Guard / CLAUDE.md):
- mempalace/version.py 3.3.2 → 3.3.3
- pyproject.toml 3.3.2 → 3.3.3
- .claude-plugin/plugin.json 3.3.2 → 3.3.3
- .claude-plugin/marketplace.json 3.3.2 → 3.3.3
- .codex-plugin/plugin.json 3.3.2 → 3.3.3
- CHANGELOG.md new [3.3.3] entry

No code changes. The fix for MemPalace#1093 is already on develop via merged PRs MemPalace#340, MemPalace#1021, MemPalace#851, MemPalace#942, MemPalace#833, MemPalace#673, MemPalace#661, MemPalace#659, MemPalace#1097, MemPalace#1051, MemPalace#1001, MemPalace#945.

Branch name intentionally outside the `release/*` ruleset so follow-up CI-fix commits aren't gated behind a nested PR. (Supersedes MemPalace#1143 — closed for exactly that reason after it missed 3 of 5 version files.)

Smoke-tested locally from a fresh develop clone:
- grep mempalace-mcp pyproject.toml .claude-plugin/plugin.json # both ✓
- python -m build --wheel # ✓
- pip install …-py3-none-any.whl # ✓
- which mempalace-mcp # ✓
- mempalace-mcp --help # ✓
…safe graph cache

31 commits including:
- v3.3.3 release (restores pip install integrity post-v3.3.2 MemPalace#1093 regression)
- MemPalace#1097 empty-string filter normalization in mempalace_search
- MemPalace#659 diary wing parameter (our fork PR, now upstream)
- MemPalace#851/MemPalace#1097/MemPalace#1021/MemPalace#661/MemPalace#673 incorporated
- website Crystal Lattice brand refresh
- thread-safe graph cache in palace_graph.py

Conflict resolutions (10 files):
- README.md: keep fork version; bump badge 3.3.1 → 3.3.3 for test compat
- hooks/README.md: keep fork silent/block architecture docs; keep MEMPAL_PYTHON (correct for the legacy hook; upstream's rename is stale)
- examples/HOOKS_TUTORIAL.md: same treatment
- mcp_server.py: take upstream's sanitize_name(wing) — strictly better than our crude lowercase+underscore normalization
- miner.py: keep fork 10K batch size + comma formatting on status; adopt upstream's pagination rationale comment
- palace_graph.py: take upstream entirely — thread-safety improvements layered on top of our MemPalace#661 (which upstream already merged)
- hooks_cli.py: take upstream (Windows path-separator compat in _wing_from_transcript_path, Codex CLI format in _extract_recent_messages), then re-apply fork-ahead: use _wing_from_transcript_path in _ingest_transcript instead of hardcoded "sessions" — keeps transcript mining coherent with the diary-wing derivation from MemPalace#659
- tests/test_hooks_cli.py: take upstream's updated wing-kwarg assertions and new test_stop_hook_derives_wing_from_transcript_path; take upstream's mock-based security test (simpler than our three-way assertion, same property tested)

Post-merge test state: 1096 passed, 10 failed in tests/test_claude_plugin_hook_wrappers.py. The 10 failures are the fork-ahead MemPalace#19 divergence already documented in CLAUDE.md: our venv-aware hooks use `dirname`/`cat`, which the test's scrubbed-PATH environment doesn't provide.
This is the same failure class that correctly caught MemPalace#1115 and led us to withdraw it pending MemPalace#1069 arbitration. Expected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem

build_graph() scans every drawer's metadata in 1000-item batches on every call — O(n) per graph build with no caching. At 50K+ drawers this costs several seconds per MCP tool call because traverse(), find_tunnels(), and graph_stats() all call build_graph() on every invocation.

Solution

Module-level cache (_graph_cache_nodes, _graph_cache_edges, _graph_cache_time) with a 60-second TTL, following the same pattern as _metadata_cache in mcp_server.py.

- build_graph() returns the cached result if it is less than 60 seconds old
- invalidate_graph_cache() is exported for write operations to call when the palace changes

Changes

- mempalace/palace_graph.py — add import time, module-level cache globals, the invalidate_graph_cache() function, a TTL check at the top of build_graph(), and a cache-store at the bottom of build_graph()
- tests/test_palace_graph.py — import invalidate_graph_cache, add setup_method to each test class that exercises build_graph (prevents cross-test cache pollution), add test_cache_returns_same_result and test_invalidate_clears_cache

Test results

All 23 tests pass, including 2 new cache-specific tests.