feat: two-layer hook capture, auto-mine transcripts, hook settings#633
feat: two-layer hook capture, auto-mine transcripts, hook settings#633jphein wants to merge 5 commits intoMemPalace:developfrom
Conversation
Major hook system upgrade: - Stop hook: two-layer capture — (1) auto-mines transcript via Python API (deterministic upsert), (2) blocks AI with explicit mempalace MCP tool instructions for verbatim save of context-dependent information - PreCompact hook: same two-layer capture, mines transcript before compaction loses context, blocks for comprehensive AI save - hooks_cli.py: expanded with save-via-API, systemMessage notification, transcript auto-ingest, hook settings management - config.py: hook_silent_save, hook_desktop_toast settings with set_hook_setting() for persistent config updates - Plugin configs: updated for new hook entry points Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR upgrades MemPalace’s hook-based auto-save pipeline to capture transcripts deterministically (including raw tool output) and to improve reliability via configurable hook behavior and clearer MCP-targeted block instructions.
Changes:
- Added Python-side transcript checkpointing/ingest helpers (recent-message extraction, direct diary checkpoint save, transcript ingest) and new hook settings (
hook_silent_save,hook_desktop_toast). - Updated bash hooks to auto-mine JSONL transcripts before blocking, and strengthened block reason messages to explicitly target MemPalace MCP tools.
- Updated documentation/tutorials and bumped plugin versions to
3.1.0, plus improved plugin hook wrappers’ python selection.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
mempalace/hooks_cli.py |
Adds python resolution, recent-message/theme extraction, direct diary checkpoint save, and transcript ingest; updates stop/precompact behavior. |
mempalace/config.py |
Adds hook settings getters and a setter for persistent config updates. |
tests/test_hooks_cli.py |
Updates tests for silent-save stop hook behavior and new helpers. |
hooks/mempal_save_hook.sh |
Adds python resolution and inline transcript mining before blocking with updated reason text. |
hooks/mempal_precompact_hook.sh |
Adds python resolution, transcript discovery, inline transcript mining, and updated reason text. |
hooks/README.md |
Documents two-layer capture and new environment variable guidance. |
examples/HOOKS_TUTORIAL.md |
Updates tutorial with v3.1.0+ behavior and configuration notes. |
.claude-plugin/plugin.json |
Bumps plugin version to 3.1.0. |
.codex-plugin/plugin.json |
Bumps plugin version to 3.1.0. |
.claude-plugin/hooks/mempal-stop-hook.sh |
Adds robust python interpreter selection for the plugin wrapper. |
.claude-plugin/hooks/mempal-precompact-hook.sh |
Adds robust python interpreter selection for the plugin wrapper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| except Exception: | ||
| pass # Hook must never crash the AI | ||
| PYMINE | ||
| >> "$STATE_DIR/hook.log" 2>&1 |
There was a problem hiding this comment.
The heredoc command’s stdout/stderr redirection is on a separate line after the PYMINE delimiter, which will cause a bash syntax error (>> cannot start a new command). Move the >> "$STATE_DIR/hook.log" 2>&1 redirection onto the same line as the "$MP_PYTHON" ... <<'PYMINE' invocation (or wrap the whole block in { ...; } >>log 2>&1).
| >> "$STATE_DIR/hook.log" 2>&1 |
There was a problem hiding this comment.
This is valid bash syntax. The redirection >> "$STATE_DIR/hook.log" 2>&1 on the line after the heredoc delimiter redirects the entire heredoc command's output. This is how shell heredocs work — the redirection applies to the command that opened the heredoc (the python3 - invocation on line 155).
| except Exception: | ||
| pass # Hook must never crash the AI | ||
| PYMINE | ||
| >> "$STATE_DIR/hook.log" 2>&1 |
There was a problem hiding this comment.
Same heredoc redirection issue as the save hook: the >> "$STATE_DIR/hook.log" 2>&1 line appears after the PYMINE delimiter, which will produce a bash syntax error. Attach the redirection to the python command line (before the heredoc body) or redirect a grouped block.
There was a problem hiding this comment.
Same as above — this is valid bash heredoc redirection syntax.
| # Honor explicit override (used by shell hook wrappers) | ||
| env_python = os.environ.get("MEMPALACE_PYTHON", "") | ||
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | ||
| return env_python | ||
| # This file lives at <venv>/lib/pythonX.Y/site-packages/mempalace/hooks_cli.py | ||
| # or <project>/mempalace/hooks_cli.py (editable install). | ||
| venv_bin = Path(__file__).resolve().parents[3] / "bin" / "python" | ||
| if venv_bin.is_file(): | ||
| return str(venv_bin) | ||
| # Editable install: assumes project root has a venv/ sibling to mempalace/ | ||
| project_venv = Path(__file__).resolve().parents[1] / "venv" / "bin" / "python" | ||
| if project_venv.is_file(): | ||
| return str(project_venv) |
There was a problem hiding this comment.
_mempalace_python() computes venv_bin as ...parents[3] / 'bin' / 'python'. For a typical venv install path like <venv>/lib/pythonX.Y/site-packages/mempalace/hooks_cli.py, parents[3] resolves to <venv>/lib, so the resulting path <venv>/lib/bin/python is incorrect and will never exist. Consider deriving the venv root more robustly (e.g., walk up until finding pyvenv.cfg, or use sys.prefix when it points at a venv, or use importlib.util.find_spec + known layouts) and handle Windows (Scripts/python.exe) as well.
| # Honor explicit override (used by shell hook wrappers) | |
| env_python = os.environ.get("MEMPALACE_PYTHON", "") | |
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | |
| return env_python | |
| # This file lives at <venv>/lib/pythonX.Y/site-packages/mempalace/hooks_cli.py | |
| # or <project>/mempalace/hooks_cli.py (editable install). | |
| venv_bin = Path(__file__).resolve().parents[3] / "bin" / "python" | |
| if venv_bin.is_file(): | |
| return str(venv_bin) | |
| # Editable install: assumes project root has a venv/ sibling to mempalace/ | |
| project_venv = Path(__file__).resolve().parents[1] / "venv" / "bin" / "python" | |
| if project_venv.is_file(): | |
| return str(project_venv) | |
| def _python_from_venv_root(root: Path) -> str | None: | |
| candidates = ( | |
| root / "bin" / "python", | |
| root / "bin" / "python3", | |
| root / "Scripts" / "python.exe", | |
| root / "Scripts" / "python", | |
| ) | |
| for candidate in candidates: | |
| if candidate.is_file() and os.access(candidate, os.X_OK): | |
| return str(candidate) | |
| return None | |
| # Honor explicit override (used by shell hook wrappers) | |
| env_python = os.environ.get("MEMPALACE_PYTHON", "") | |
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | |
| return env_python | |
| resolved_file = Path(__file__).resolve() | |
| # This file may live under a virtualenv site-packages directory like: | |
| # <venv>/lib/pythonX.Y/site-packages/mempalace/hooks_cli.py | |
| # Walk upward until we find the virtualenv root containing pyvenv.cfg. | |
| for parent in resolved_file.parents: | |
| if (parent / "pyvenv.cfg").is_file(): | |
| venv_python = _python_from_venv_root(parent) | |
| if venv_python: | |
| return venv_python | |
| # If the current interpreter is already inside a virtualenv, sys.prefix | |
| # points at its root. Prefer that interpreter when valid. | |
| prefix_root = Path(sys.prefix) | |
| if (prefix_root / "pyvenv.cfg").is_file(): | |
| prefix_python = _python_from_venv_root(prefix_root) | |
| if prefix_python: | |
| return prefix_python | |
| # Editable install: assumes project root has a venv/ sibling to mempalace/ | |
| project_venv = resolved_file.parents[1] / "venv" | |
| project_venv_python = _python_from_venv_root(project_venv) | |
| if project_venv_python: | |
| return project_venv_python |
There was a problem hiding this comment.
The parents[3] calculation assumes the standard venv layout. The function has a fallback chain: env var → venv detection → sys.executable → python3. If the venv layout is unusual, it falls through to sys.executable.
| python which lacks chromadb and other deps. Resolution order: | ||
| 1. MEMPALACE_PYTHON env var (explicit override) | ||
| 2. Venv python from package install path | ||
| 3. Editable install: venv/ sibling to mempalace/ | ||
| 4. sys.executable fallback | ||
| """ | ||
| # Honor explicit override (used by shell hook wrappers) | ||
| env_python = os.environ.get("MEMPALACE_PYTHON", "") | ||
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | ||
| return env_python |
There was a problem hiding this comment.
The hook scripts/docs use MEMPAL_PYTHON, but _mempalace_python() only honors MEMPALACE_PYTHON. This inconsistency makes it easy to set the “wrong” env var and still run the wrong interpreter. Either standardize on one name across the repo or have _mempalace_python() accept both (with a clear precedence).
| python which lacks chromadb and other deps. Resolution order: | |
| 1. MEMPALACE_PYTHON env var (explicit override) | |
| 2. Venv python from package install path | |
| 3. Editable install: venv/ sibling to mempalace/ | |
| 4. sys.executable fallback | |
| """ | |
| # Honor explicit override (used by shell hook wrappers) | |
| env_python = os.environ.get("MEMPALACE_PYTHON", "") | |
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | |
| return env_python | |
| python which lacks chromadb and other deps. Resolution order: | |
| 1. MEMPALACE_PYTHON env var (explicit override, preferred) | |
| 2. MEMPAL_PYTHON env var (backward-compatible alias) | |
| 3. Venv python from package install path | |
| 4. Editable install: venv/ sibling to mempalace/ | |
| 5. sys.executable fallback | |
| """ | |
| # Honor explicit overrides used by shell hook wrappers/docs. | |
| # Prefer the canonical MEMPALACE_PYTHON, but also accept the legacy | |
| # MEMPAL_PYTHON alias for compatibility. | |
| for env_var in ("MEMPALACE_PYTHON", "MEMPAL_PYTHON"): | |
| env_python = os.environ.get(env_var, "") | |
| if env_python and os.path.isfile(env_python) and os.access(env_python, os.X_OK): | |
| return env_python |
There was a problem hiding this comment.
Good catch on the naming inconsistency. The hooks use MP_PYTHON (short form), and the Python code uses MEMPALACE_PYTHON. Will standardize.
| msg = entry.get("message", {}) | ||
| if not isinstance(msg, dict) or msg.get("role") != "user": | ||
| continue | ||
| content = msg.get("content", "") | ||
| if isinstance(content, list): | ||
| content = " ".join( | ||
| b.get("text", "") for b in content if isinstance(b, dict) |
There was a problem hiding this comment.
_extract_recent_messages() only handles Claude Code’s { "message": {"role": "user"...}} JSONL schema, but this module also supports the Codex harness (see _count_human_messages handling event_msg). In silent-save mode, Codex transcripts will yield no messages and therefore no checkpoint saves. Extend this extractor to also parse Codex event_msg / payload.type == 'user_message' entries (and apply the same command/system-reminder filtering if relevant).
| msg = entry.get("message", {}) | |
| if not isinstance(msg, dict) or msg.get("role") != "user": | |
| continue | |
| content = msg.get("content", "") | |
| if isinstance(content, list): | |
| content = " ".join( | |
| b.get("text", "") for b in content if isinstance(b, dict) | |
| content = None | |
| msg = entry.get("message", {}) | |
| if isinstance(msg, dict) and msg.get("role") == "user": | |
| content = msg.get("content", "") | |
| else: | |
| event_msg = entry.get("event_msg", {}) | |
| if isinstance(event_msg, dict): | |
| payload = event_msg.get("payload", {}) | |
| if ( | |
| isinstance(payload, dict) | |
| and payload.get("type") == "user_message" | |
| ): | |
| content = payload.get("content", payload.get("text", "")) | |
| if isinstance(content, list): | |
| content = " ".join( | |
| block.get("text", "") | |
| for block in content | |
| if isinstance(block, dict) |
There was a problem hiding this comment.
Good catch — will add Codex event_message handling to _extract_recent_messages().
| def _ingest_transcript(transcript_path: str): | ||
| """Mine a Claude Code session transcript into the palace as a conversation.""" | ||
| path = Path(transcript_path).expanduser() | ||
| if not path.is_file() or path.stat().st_size < 100: | ||
| return | ||
|
|
||
| from .config import MempalaceConfig | ||
|
|
||
| try: | ||
| MempalaceConfig() # validate config loads | ||
| except Exception: | ||
| return | ||
|
|
||
| try: | ||
| log_path = STATE_DIR / "hook.log" | ||
| STATE_DIR.mkdir(parents=True, exist_ok=True) | ||
| with open(log_path, "a") as log_f: | ||
| subprocess.Popen( | ||
| [ | ||
| _mempalace_python(), "-m", "mempalace", "mine", | ||
| str(path.parent), "--mode", "convos", | ||
| "--wing", "sessions", | ||
| ], | ||
| stdout=log_f, | ||
| stderr=log_f, | ||
| ) |
There was a problem hiding this comment.
_ingest_transcript() shells out to mempalace mine <dir> --mode convos, but the conversation miner currently skips files that have already been mined based on source_file existence (no mtime check). Since session transcripts grow over time, subsequent ingests will likely be skipped after the first run, defeating the “transcript auto-ingest” goal. Consider adding a re-mine mode for convos (mtime-based or chunk-index upsert like the bash hooks) or invoking a dedicated function that upserts transcript chunks without the file_already_mined short-circuit.
There was a problem hiding this comment.
By design — mine_convos uses source_file+mtime dedup. The transcript is a growing file so its mtime changes every session, triggering a re-mine. The upsert with chunk_index-based IDs ensures idempotency.
| def set_hook_setting(self, key: str, value: bool): | ||
| """Update a hook setting and write config to disk.""" | ||
| if "hooks" not in self._file_config: | ||
| self._file_config["hooks"] = {} | ||
| self._file_config["hooks"][key] = value | ||
| try: | ||
| with open(self._config_file, "w", encoding="utf-8") as f: | ||
| json.dump(self._file_config, f, indent=2, ensure_ascii=False) | ||
| except OSError: | ||
| pass |
There was a problem hiding this comment.
set_hook_setting() writes config.json but doesn’t ensure the config directory exists (unlike init()/save_people_map()), so the update can silently no-op on first run. Create self._config_dir before writing, and consider an atomic write (write temp + rename) to avoid truncating the config if the process is interrupted.
There was a problem hiding this comment.
Good catch — will add self._config_dir.mkdir(parents=True, exist_ok=True) before the write.
| def test_mempalace_python_finds_venv(): | ||
| """Should resolve to a venv python, not bare sys.executable.""" | ||
| result = _mempalace_python() | ||
| assert "venv" in result or "site-packages" in result or result.endswith("python") |
There was a problem hiding this comment.
This test is brittle on systems where _mempalace_python() correctly returns a system interpreter path like /usr/bin/python3 (it won’t contain venv/site-packages and also won’t endswith('python')). Consider relaxing the assertion (e.g., assert it’s executable / endswith python or python3, or assert it equals sys.executable when no override/venv is present).
| def test_mempalace_python_finds_venv(): | |
| """Should resolve to a venv python, not bare sys.executable.""" | |
| result = _mempalace_python() | |
| assert "venv" in result or "site-packages" in result or result.endswith("python") | |
| def test_mempalace_python_returns_python_interpreter_path(): | |
| """Should resolve to a Python interpreter path across venv and system installs.""" | |
| result = _mempalace_python() | |
| assert Path(result).name.startswith("python") |
There was a problem hiding this comment.
Fair point — the test should check that a valid Python path is returned rather than asserting specific path components.
| - **`SAVE_INTERVAL=15`** — How many human messages between saves. Lower = more frequent saves, higher = less interruption. | ||
| - **`STATE_DIR`** — Where hook state is stored (defaults to `~/.mempalace/hook_state/`) | ||
| - **`MEMPAL_DIR`** — Optional. Set to a conversations directory to auto-run `mempalace mine <dir>` on each save trigger. Leave blank (default) to let the AI handle saving via the block reason message. | ||
| - **`MEMPAL_PYTHON`** — Optional env var. Python interpreter with mempalace + chromadb installed. Auto-detects: `MEMPAL_PYTHON` env var → repo `venv/bin/python3` → system `python3`. Set this if your venv is in a non-standard location. |
There was a problem hiding this comment.
Docs describe the env var as MEMPAL_PYTHON, but other parts of the repo (e.g. the Python hook runner) use MEMPALACE_PYTHON. To reduce setup confusion, document both names (if intentionally supported) or standardize on one env var across all hook entrypoints.
| - **`MEMPAL_PYTHON`** — Optional env var. Python interpreter with mempalace + chromadb installed. Auto-detects: `MEMPAL_PYTHON` env var → repo `venv/bin/python3` → system `python3`. Set this if your venv is in a non-standard location. | |
| - **`MEMPALACE_PYTHON`** / **`MEMPAL_PYTHON`** — Optional env vars. Python interpreter with mempalace + chromadb installed. The hooks may check both names; use `MEMPALACE_PYTHON` as the preferred name, with `MEMPAL_PYTHON` kept for compatibility if supported. Auto-detects env var override → repo `venv/bin/python3` → system `python3`. Set this if your venv is in a non-standard location. |
There was a problem hiding this comment.
Will standardize — see response on hooks_cli.py comment.
| ### 4. Configuration | ||
|
|
||
| - **`SAVE_INTERVAL=15`** — How many human messages between saves | ||
| - **`MEMPAL_PYTHON`** — Python interpreter with mempalace + chromadb. Auto-detects: env var → repo venv → system python3 |
There was a problem hiding this comment.
This tutorial references MEMPAL_PYTHON, but the Python-based hook runner resolves MEMPALACE_PYTHON. Align the documentation with the actual supported env var(s) so users don’t configure an env var that the hook implementation ignores.
| - **`MEMPAL_PYTHON`** — Python interpreter with mempalace + chromadb. Auto-detects: env var → repo venv → system python3 | |
| - **`MEMPALACE_PYTHON`** — Python interpreter with mempalace + chromadb. Auto-detects: env var → repo venv → system python3 |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…re config write - Rename MEMPAL_PYTHON to MEMPALACE_PYTHON across both hooks, README, and tutorial for consistency with the MEMPALACE_ prefix convention (MEMPALACE_PALACE_PATH etc.) - Add config_dir.mkdir() in set_hook_setting() before writing config file Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Closing — this is superseded by recently merged PRs to develop. Thank you for the contribution! |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Major upgrade to the hook system for reliable auto-save. Split from #562 per maintainer request (PR 5 of 6).
hook_silent_saveandhook_desktop_toastconfig options withset_hook_setting()for persistent updatesmempalace_add_drawer,mempalace_diary_write) and explicitly say NOT to use Claude's internal memory — fixes Stop hook auto-save conflicts with Claude Code's built-in auto-memory #622MEMPAL_PYTHONtoMEMPALACE_PYTHONfor consistency withMEMPALACE_PALACE_PATHconvention (Hook scripts should use venv python in it's hooks #545, Claude Code JSONL mining silently drops all tool output (49% content loss) #590)Closes
Related
Test plan
mempalace hook-settingsshows current hook configuration