perf: graph cache with write-invalidation in build_graph()#661

Merged
bensig merged 3 commits into MemPalace:develop from jphein:perf/graph-cache
Apr 22, 2026

Conversation

@jphein
Collaborator

@jphein jphein commented Apr 12, 2026

Problem

build_graph() scans every drawer's metadata in 1000-item batches on every call — O(n) per graph build with no caching. At 50K+ drawers this costs several seconds per MCP tool call because traverse(), find_tunnels(), and graph_stats() all call build_graph() on every invocation.

Solution

Module-level cache (_graph_cache_nodes, _graph_cache_edges, _graph_cache_time) with a 60-second TTL, following the same pattern as _metadata_cache in mcp_server.py.

  • build_graph() returns the cached result if < 60 seconds old
  • invalidate_graph_cache() is exported for write operations to call when the palace changes
  • Cache is only populated on successful full scans (not on empty collections)
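The three bullets above can be sketched as a minimal, self-contained example. Names mirror the PR's globals; the `scan` callable is a hypothetical stand-in for the real ChromaDB batch metadata scan (the actual `build_graph()` takes `col`/`config` arguments instead):

```python
import time

# Module-level cache state, mirroring the names used in the PR.
_graph_cache_nodes = None
_graph_cache_edges = None
_graph_cache_time = 0.0
_GRAPH_CACHE_TTL = 60.0  # seconds


def invalidate_graph_cache():
    """Clear the cache; write operations call this when the palace changes."""
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
    _graph_cache_nodes = None
    _graph_cache_edges = None
    _graph_cache_time = 0.0


def build_graph(scan):
    """Return (nodes, edges), serving from cache when it is < TTL old."""
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
    now = time.time()
    if _graph_cache_nodes is not None and (now - _graph_cache_time) < _GRAPH_CACHE_TTL:
        return _graph_cache_nodes, _graph_cache_edges  # warm-cache hit
    nodes, edges = scan()  # the full O(n) metadata scan
    if nodes:  # only cache successful non-empty scans
        _graph_cache_nodes, _graph_cache_edges = nodes, edges
        _graph_cache_time = now
    return nodes, edges
```

The `if nodes:` guard means an empty collection is never cached, so newly added drawers show up on the next call rather than after the TTL expires.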

Changes

  • mempalace/palace_graph.py — add import time, module-level cache globals, invalidate_graph_cache() function, TTL check at top of build_graph(), cache-store at bottom of build_graph()
  • tests/test_palace_graph.py — import invalidate_graph_cache, add setup_method to each test class that exercises build_graph (prevents cross-test cache pollution), add test_cache_returns_same_result and test_invalidate_clears_cache

Test results

23 passed in 0.21s

All 23 tests pass, including the 2 new cache-specific tests.

Copilot AI review requested due to automatic review settings April 12, 2026 02:45
Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a short-lived (60s TTL) module-level cache for build_graph() to avoid repeated full metadata scans when multiple MCP tools build the palace graph in quick succession.

Changes:

  • Added module-level graph cache state + invalidate_graph_cache() helper in mempalace/palace_graph.py.
  • Updated build_graph() to return cached results within TTL and to store results after a scan.
  • Added cache-focused tests and per-test cache invalidation to prevent cross-test interference.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
mempalace/palace_graph.py Adds TTL-based caching for graph builds and an explicit invalidation function.
tests/test_palace_graph.py Adds cache invalidation in test setup and new tests asserting cache/invalidation behavior.

Comment thread mempalace/palace_graph.py Outdated
Comment on lines +24 to +33
```python
# Module-level graph cache — mirrors _metadata_cache pattern in mcp_server.py
_graph_cache_nodes = None
_graph_cache_edges = None
_graph_cache_time = 0.0
_GRAPH_CACHE_TTL = 60.0  # seconds — graph changes less often than metadata


def invalidate_graph_cache():
    """Clear the graph cache. Called from mcp_server.py on writes."""
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
```
Collaborator Author


Fixed — removed the stale reference to _metadata_cache. Comment is now self-contained: module-level cache with 60s TTL and write-invalidation. 277e476

Comment thread mempalace/palace_graph.py
Comment on lines +62 to +66
```python
    global _graph_cache_nodes, _graph_cache_edges, _graph_cache_time
    now = time.time()
    if _graph_cache_nodes is not None and (now - _graph_cache_time) < _GRAPH_CACHE_TTL:
        return _graph_cache_nodes, _graph_cache_edges
```

Collaborator Author


Documented — added comment explaining this is intentional for the MCP server's single-palace use case. Callers switching collections should call invalidate_graph_cache() first. 277e476

Comment thread mempalace/palace_graph.py Outdated
Comment on lines +122 to +124
```python
    _graph_cache_nodes = nodes
    _graph_cache_edges = edges
    _graph_cache_time = time.time()
```
Collaborator Author


Fixed — empty graphs are no longer cached (if nodes: guard). New data is picked up immediately when palace is first populated. 277e476

Comment on lines +127 to +131
```python
        # Second call with a *different* collection — should still return cached result
        col2 = _make_fake_collection([])
        nodes2, edges2 = build_graph(col=col2)
        assert nodes1 == nodes2
        assert edges1 == edges2
```
Collaborator Author


Documented — expanded docstring to explain this verifies intentional single-palace cache behavior. If multi-palace support is needed, cache key should be extended and this test updated. 277e476

jphein added a commit to jphein/mempalace that referenced this pull request Apr 12, 2026
Addresses Copilot review feedback on MemPalace#661.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator

@bensig bensig left a comment


Two issues found in code review:

  1. No thread safety on cache globals (score 95) — _graph_cache_nodes, _graph_cache_edges, _graph_cache_time are read and written without any lock. Concurrent requests can read stale or partially-written cache. Add a threading.Lock around the read/write paths.

  2. Warm cache silently ignores col argument (score 82) — build_graph(col=different_col) returns cached result for the original collection with no indication. Consider keying the cache on the collection identity, or at minimum documenting this behavior in the docstring and logging when cached args are ignored.
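A hedged sketch of how issue 1 could be addressed — this is not the PR's actual code, just one lock-guarded variant. Bundling nodes, edges, and timestamp into a single tuple under one `threading.Lock` rules out torn reads where the nodes and edges come from different scans:

```python
import threading
import time

_graph_cache = None  # (nodes, edges, stored_at) or None — updated atomically
_graph_cache_lock = threading.Lock()
_GRAPH_CACHE_TTL = 60.0


def invalidate_graph_cache():
    global _graph_cache
    with _graph_cache_lock:
        _graph_cache = None


def build_graph(scan):
    """scan is a hypothetical stand-in for the real collection scan."""
    global _graph_cache
    with _graph_cache_lock:
        if _graph_cache is not None:
            nodes, edges, stored_at = _graph_cache
            if time.time() - stored_at < _GRAPH_CACHE_TTL:
                return nodes, edges
    nodes, edges = scan()  # run the slow scan outside the lock
    if nodes:
        with _graph_cache_lock:
            _graph_cache = (nodes, edges, time.time())
    return nodes, edges
```

Running the scan outside the lock means two cold-cache callers may scan redundantly, but readers are never blocked behind a multi-second scan and never observe a partially written cache.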

@JoyciAkira

Closing duplicate PR created by network timeout.

jphein added a commit to jphein/mempalace that referenced this pull request Apr 12, 2026
Addresses Copilot review feedback on MemPalace#661.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/graph-cache branch from 277e476 to cf8034c Compare April 12, 2026 21:54
jphein added a commit to jphein/mempalace that referenced this pull request Apr 13, 2026
Addresses Copilot review feedback on MemPalace#661.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/graph-cache branch from cf8034c to 509120c Compare April 13, 2026 14:22
@jphein
Collaborator Author

jphein commented Apr 13, 2026

@bensig Both issues from your review are already addressed in the current branch:

  1. Thread safety: `_graph_cache_lock = threading.Lock()` guards all cache reads (line 74) and writes (lines 37, 136). Added in commit 277e476 responding to the Copilot review.

  2. Warm cache ignores col — Documented in both the docstring ("warm cache ignores col and config arguments — this is intentional for the MCP server's single-palace use case") and an inline comment above the cache check. Callers switching collections should call invalidate_graph_cache() first.

Branch has been rebased on latest develop (v3.2.0). Ready for re-review.

jphein and others added 3 commits April 16, 2026 09:00
build_graph() scans every drawer's metadata in 1000-item batches on
every call — O(n) per graph build with no caching. At 50K+ drawers
this costs several seconds per MCP tool call (traverse, find_tunnels,
graph_stats all call build_graph on every invocation).

Add a module-level cache (nodes + edges + timestamp) with a 60-second
TTL. Cache is invalidated via invalidate_graph_cache(), exported for
write operations to call. Tests updated with setup_method cache resets
and two new tests verifying cache hit and invalidation behaviour.
Addresses Copilot review feedback on MemPalace#661.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback from @bensig:
1. Wrap cache reads/writes in threading.Lock for thread safety
2. Promote the col-arg caveat from inline comment to docstring

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/graph-cache branch from 79e78db to 8adf35a Compare April 16, 2026 16:01
jphein added a commit to jphein/mempalace that referenced this pull request Apr 16, 2026
fork-direction.md and TODO-fork-improvements.md were scattered strategic
thinking that belongs in the front-door README: competitive landscape,
roadmap (P0-P6), and open problems. Merges them in and deletes the
standalone files.

PR-status tables (README + CLAUDE.md) were stale after today's rebases
of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect current mergeable
state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863.

test_readme_claims.py skips MCP-tool-table and dialect-reference checks
when absent — our slimmed fork README doesn't reproduce upstream's tool
table structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jphein
Collaborator Author

jphein commented Apr 18, 2026

Ping @bensig — the changes you requested on 2026-04-12 have been addressed in the follow-up commit on 2026-04-16 (added threading.Lock around the module-level cache write, which was the main thread-safety concern from your review). Would love to get this back on your radar when you have a moment — happy to iterate further if there's anything still off.

jphein added a commit to jphein/mempalace that referenced this pull request Apr 18, 2026
Every remaining row in "Still ahead of upstream" now carries a status
so the reader can tell at a glance whether each change is being
upstreamed, pending a PR, or deliberately fork-local.

Dropped:
- "Guard ChromaDB 1.5.x metadata-mismatch segfault" — this row was
  overstated. The memory file for today's debugging notes that the
  try-get/except-create pattern is defensive code that never
  reproduced a specific crash (the actual crashes traced to HNSW
  drift). Leaving it in "Still ahead" implied an upstream-candidate
  fix, which it isn't. Code stays in place as defensive, but the
  README no longer claims it as a fork-ahead feature.

Moved to Superseded:
- "Stale HNSW mtime detection + mempalace_reconnect" — upstream took
  a different approach in MemPalace#757. Our broader inode+mtime detection
  and the mempalace_reconnect MCP tool remain as fork-local
  convenience; they're just not "ahead of upstream" anymore.

Statuses now populated:
- Linked PR number for the 7 changes with active upstream PRs
  (MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673 with APPROVED note, MemPalace#999, MemPalace#1000, MemPalace#1005).
- "PR pending" for 3 items that are good candidates but unfiled:
  epsilon mtime comparison, max_distance parameter, tool output
  mining.
- "fork-only" for 2 items we keep intentionally without pitching
  upstream: .blob_seq_ids_migrated marker (narrow), bulk_check_mined
  (complementary to upstream's MemPalace#784 file-locking).

Legend sentence added above the table explains the three status
values. 42 README-claim tests pass.
jphein added a commit to jphein/mempalace that referenced this pull request Apr 21, 2026
…#1023 merged, MemPalace#673 rebased

- Bump fork-ahead section header to "after v3.3.2"
- Strike #11 (quarantine_stale_hnsw) and MemPalace#18 (PID file guard) as
  merged-into-upstream-via-v3.3.2, keep entries for traceability
- Add v3.3.2-shipped items to "Merged into upstream (post-v3.3.1)"
- Rebuild PR table: 10 merged / 7 open / 7 closed; add MemPalace#1024 row,
  reclassify MemPalace#681/MemPalace#1000/MemPalace#1023 as merged, note MemPalace#673 rebased 2026-04-21
- Annotate MemPalace#661 status with the GitHub review-state machine caveat
  (CHANGES_REQUESTED persists until reviewer dismisses, not owed)
- Bump test count 1063 → 1096 post-merge
jphein added a commit to jphein/mempalace that referenced this pull request Apr 21, 2026
…unts

- Upstream released v3.3.2 on 2026-04-21 (our MemPalace#681/MemPalace#1000/MemPalace#1023)
- Drawer count 152,682 → 165,632; wings 23 → 28; tests 1063 → 1096
- MemPalace#673 now MERGEABLE after fresh rebase + squash to 1 commit
- MemPalace#661 status clarified: CHANGES_REQUESTED persists until reviewer
  dismisses it, not an outstanding obligation
- Merged-upstream section split out v3.3.2 release group
Collaborator

@bensig bensig left a comment


Threading.Lock usage correct. Cache invalidation sound (write-invalidation + 60s TTL fallback). Both original review items addressed. CI green across all platforms.

@bensig bensig merged commit 9b35d9f into MemPalace:develop Apr 22, 2026
6 checks passed
jphein added a commit to jphein/mempalace that referenced this pull request Apr 22, 2026
… (2026-04-22)

Ben's batched queue-clear pass merged four PRs at 00:38 UTC:
graph cache (MemPalace#661), deterministic hook saves (MemPalace#673), Claude Code
2.1.114 hook stdout + silent_save guard (MemPalace#1021), and upstream's
own MemPalace#851 pagination fix (closing MemPalace#1036 as superseded).

Moved four rows out of the "Fork Changes" / "Fork change queue"
tables into their respective merged-upstream history sections.
Intro sentence PR count reduced from 7 → 4 open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jphein pushed a commit to jphein/mempalace that referenced this pull request Apr 24, 2026
Restore-integrity release. Unbreaks fresh `pip install mempalace` from
v3.3.2 by re-tagging current develop, which carries both the plugin.json
consumer (shipped in 3.3.2) and the matching mempalace-mcp entry point
in pyproject.toml (added on develop ~10h after the 3.3.2 tag via MemPalace#340
by @messelink). MemPalace#1093 diagnosed by @jphein.

Bumps (all 5 sources agree per Version Guard / CLAUDE.md):
- mempalace/version.py              3.3.2 → 3.3.3
- pyproject.toml                     3.3.2 → 3.3.3
- .claude-plugin/plugin.json         3.3.2 → 3.3.3
- .claude-plugin/marketplace.json    3.3.2 → 3.3.3
- .codex-plugin/plugin.json          3.3.2 → 3.3.3
- CHANGELOG.md                        new [3.3.3] entry

No code changes. The fix for MemPalace#1093 is already on develop via merged PRs
MemPalace#340, MemPalace#1021, MemPalace#851, MemPalace#942, MemPalace#833, MemPalace#673, MemPalace#661, MemPalace#659, MemPalace#1097, MemPalace#1051, MemPalace#1001,
MemPalace#945.

Branch name intentionally outside the `release/*` ruleset so follow-up
CI-fix commits aren't gated behind a nested PR. (Supersedes MemPalace#1143 —
closed for exactly that reason after it missed 3 of 5 version files.)

Smoke-tested locally from a fresh develop clone:
  grep mempalace-mcp pyproject.toml .claude-plugin/plugin.json   # both ✓
  python -m build --wheel                                        # ✓
  pip install …-py3-none-any.whl                                 # ✓
  which mempalace-mcp                                            # ✓
  mempalace-mcp --help                                           # ✓
jphein added a commit to jphein/mempalace that referenced this pull request Apr 24, 2026
…safe graph cache

31 commits including:
- v3.3.3 release (restores pip install integrity post-v3.3.2 MemPalace#1093 regression)
- MemPalace#1097 empty-string filter normalization in mempalace_search
- MemPalace#659 diary wing parameter (our fork PR, now upstream)
- MemPalace#851/MemPalace#1097/MemPalace#1021/MemPalace#661/MemPalace#673 incorporated
- website Crystal Lattice brand refresh
- thread-safe graph cache in palace_graph.py

Conflict resolutions (10 files):
- README.md            keep fork version; bump badge 3.3.1 → 3.3.3 for test compat
- hooks/README.md      keep fork silent/block architecture docs; keep MEMPAL_PYTHON
                       (correct for legacy hook, upstream's rename is stale)
- examples/HOOKS_TUTORIAL.md  same treatment
- mcp_server.py        take upstream's sanitize_name(wing) — strictly better than
                       our crude lowercase+underscore normalization
- miner.py             keep fork 10K batch size + comma formatting on status;
                       adopt upstream's pagination rationale comment
- palace_graph.py      take upstream entirely — thread-safety improvements layered
                       on top of our MemPalace#661 (which upstream already merged)
- hooks_cli.py         take upstream (Windows path-separator compat in
                       _wing_from_transcript_path, Codex CLI format in
                       _extract_recent_messages), then re-apply fork-ahead: use
                       _wing_from_transcript_path in _ingest_transcript instead
                       of hardcoded "sessions" — keeps transcript mining coherent
                       with the diary wing derivation from MemPalace#659
- tests/test_hooks_cli.py  take upstream's updated wing-kwarg assertions and
                       new test_stop_hook_derives_wing_from_transcript_path;
                       take upstream's mock-based security test (simpler than
                       our three-way assertion, same property tested)

Post-merge test state:
- 1096 passed, 10 failed in tests/test_claude_plugin_hook_wrappers.py
- The 10 failures are the fork-ahead MemPalace#19 divergence already documented in
  CLAUDE.md: our venv-aware hooks use `dirname`/`cat` which the test's
  scrubbed-PATH environment doesn't provide. Same class that correctly caught
  MemPalace#1115 and led us to withdraw it pending MemPalace#1069 arbitration. Expected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jphein jphein deleted the perf/graph-cache branch April 25, 2026 14:53

Labels

performance Performance improvements


5 participants