Context
PR #1148 landed v1 of init entity detection: deterministic signal from package manifests (package.json, pyproject.toml, Cargo.toml, go.mod) + git authors, with an improved regex fallback for prose. That path is strong for codebases. It's still limited for prose-only corpora (notes, diaries, chat transcripts) where a short capitalized token can be a person, a place, a fictional character, or a common English word — regex alone cannot disambiguate.
This issue tracks phase 2: an opt-in LLM refinement step that takes the candidate set produced by phase 1 and reclassifies it with semantic understanding.
Goals
- Ship as opt-in only — default remains zero-API / local-first.
- Default provider is local (Ollama), so the feature can work fully offline on a consumer machine.
- Pluggable provider interface so a user with an API key can swap in a hosted model.
- Run interactively during `mempalace init` with a visible progress indicator and clean cancellation (Ctrl-C → return partial results and proceed).
- Don't feed raw corpora to the model; feed regex-detected candidates + sampled context.
- Separately: deterministic parsing of `~/.claude/projects/-slug` directory names as an extra project-name source (no LLM required, but useful for conversation-heavy users); a parsing sketch follows this list.
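A minimal sketch of that convo-scanner parse, assuming the slug is the absolute project path with `/` encoded as `-` (which makes hyphens inside real directory names ambiguous); `project_names_from_claude_dirs` is a hypothetical name:

```python
from pathlib import Path

def project_names_from_claude_dirs(
    root: Path = Path.home() / ".claude" / "projects",
) -> set[str]:
    """Collect candidate project names from Claude Code project-dir slugs."""
    names: set[str] = set()
    if not root.is_dir():
        return names
    for entry in root.iterdir():
        # Slugs look like "-Users-alice-code-myproject": a path with "/" -> "-".
        if entry.is_dir() and entry.name.startswith("-"):
            segments = [s for s in entry.name.split("-") if s]
            if segments:
                # Heuristic: the last segment is usually the project name.
                # Hyphenated project names get split by this encoding, so
                # downstream review may still be needed for those.
                names.add(segments[-1])
    return names
```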
CLI shape
```
mempalace init <dir>                 # unchanged, no LLM
mempalace init <dir> --llm           # opt-in, use configured default
mempalace init <dir> --llm-provider ollama --llm-model gemma3:4b
mempalace init <dir> --llm-provider openai-compat \
    --llm-endpoint http://localhost:1234/v1 \
    --llm-model <any>
mempalace init <dir> --llm-provider anthropic --llm-model claude-haiku-4-5
```
Persist defaults via `mempalace config set llm.provider ollama` / `mempalace config set llm.model ...`; `MEMPALACE_LLM_*` environment variables override persisted config.
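One plausible resolution order for these settings, with flag beating env beating persisted config (helper name hypothetical):

```python
import os

def resolve_llm_setting(key: str, flag_value: str | None,
                        config: dict[str, str], default: str) -> str:
    """Resolve an llm.* setting: CLI flag > MEMPALACE_LLM_* env > config > default."""
    env_value = os.environ.get(f"MEMPALACE_LLM_{key.upper()}")
    return flag_value or env_value or config.get(f"llm.{key}") or default

# e.g. resolve_llm_setting("provider", args.llm_provider, cfg, "ollama")
```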
Architecture
New discover_entities(dir, llm=None) layering:
1. project_scanner — manifests + git (fast, deterministic)
2. convo_scanner (new) — parse .claude/projects/-slug dirnames
3. regex detector — prose fallback (fast, noisy)
4. llm_refine (if --llm) — sample corpus, classify candidates, merge
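A sketch of that layering as code; `project_scanner`, `convo_scanner`, and `llm_refine` are named as in this issue, while `regex_detect`, the `Candidate` record, and the exact signatures are placeholders:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Candidate:
    name: str
    type: str          # PERSON / PROJECT / TOPIC / ...
    confidence: float
    source: str        # which layer produced it

def discover_entities(dir: Path, llm=None) -> list[Candidate]:
    candidates: list[Candidate] = []
    candidates += project_scanner(dir)   # 1. manifests + git (deterministic)
    candidates += convo_scanner(dir)     # 2. .claude/projects dirnames
    candidates += regex_detect(dir)      # 3. prose fallback (fast, noisy)
    if llm is not None:
        # 4. opt-in semantic pass over the regex candidates
        candidates = llm_refine(candidates, dir, llm)
    return candidates
```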
LLM step specifically:
- Take top N regex-detected candidates (with their current type + confidence).
- For each, collect 3–5 sample context lines from the source corpus.
- Send candidates in batches of 30–50 per request to stay within a small model's context window.
- Request structured JSON output:
{name, type ∈ {PERSON, PROJECT, TOPIC, COMMON_WORD, AMBIGUOUS}, reason}.
- Merge: reclassify, drop COMMON_WORD, flag AMBIGUOUS for user review.
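A sketch of that merge rule, assuming a `Candidate` record like the one in the layering sketch above; the `Classification` type and field names are assumptions:

```python
from dataclasses import dataclass

VALID_TYPES = {"PERSON", "PROJECT", "TOPIC", "COMMON_WORD", "AMBIGUOUS"}

@dataclass
class Classification:
    name: str
    type: str      # one of VALID_TYPES
    reason: str

def merge(candidates: list, classifications: list[Classification]) -> list:
    by_name = {c.name: c for c in classifications if c.type in VALID_TYPES}
    merged = []
    for cand in candidates:
        result = by_name.get(cand.name)
        if result is None:
            merged.append(cand)      # never classified (e.g. Ctrl-C): keep regex verdict
        elif result.type == "COMMON_WORD":
            continue                 # drop false positives outright
        else:
            cand.type = result.type                           # reclassify
            cand.needs_review = (result.type == "AMBIGUOUS")  # surface in confirm step
            merged.append(cand)
    return merged
```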
Interactive UX
- Progress line during refinement: `refining candidates: 12/50 (current: NAME_REDACTED)`.
- Ctrl-C behavior: interrupt the current batch, accept results from completed batches, skip the rest, proceed to `confirm_entities` with whatever was classified.
- If provider/model unavailable: fail fast with a clear error, do NOT silently fall back to regex-only (user asked for LLM, respect that).
- If LLM disagrees with regex classification, show both in the confirm step so the user can adjudicate.
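A sketch of the cancellation contract: a KeyboardInterrupt discards only the in-flight batch, and whatever completed flows on to `confirm_entities` (names hypothetical):

```python
def refine_with_cancellation(batches, classify_batch):
    """Run classify_batch over each batch; on Ctrl-C, keep completed batches only."""
    results = []
    try:
        for i, batch in enumerate(batches, 1):
            print(f"refining candidates: batch {i}/{len(batches)}", end="\r", flush=True)
            results.extend(classify_batch(batch))   # in-flight batch is lost on interrupt
    except KeyboardInterrupt:
        print("\ninterrupted: keeping results from completed batches")
    return results  # partial on interrupt; caller proceeds to confirm_entities
```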
Scale approach (sampling)
- For code: skip the LLM entirely. Manifests are authoritative.
- For prose / conversations: take a stratified sample weighted by recency (newer first) and source (user-authored content over machine-generated); see the sketch after this list. Target budget: ~50–100K input tokens total across all batches.
- On a 4B local model via Ollama, expected wall time: roughly 1–3 minutes for a large prose corpus. Acceptable for one-time init.
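One way to implement the stratified sample within that budget; the 70/30 split and the 4-chars-per-token estimate are assumptions, not measurements:

```python
from dataclasses import dataclass

@dataclass
class SourceLine:
    text: str
    mtime: float          # file modification time; larger = newer
    user_authored: bool   # True for user-written content

def sample_corpus(lines: list[SourceLine], budget_tokens: int = 75_000) -> list[SourceLine]:
    """Stratify by source, then take newest-first within each stratum until quota."""
    strata = [
        (lambda l: l.user_authored, 0.7),      # user-authored: most of the budget
        (lambda l: not l.user_authored, 0.3),  # machine-generated: the rest
    ]
    sampled: list[SourceLine] = []
    for belongs, share in strata:
        quota, used = int(budget_tokens * share), 0
        for line in sorted((l for l in lines if belongs(l)),
                           key=lambda l: l.mtime, reverse=True):
            cost = max(1, len(line.text) // 4)  # rough chars-to-tokens estimate
            if used + cost > quota:
                break
            sampled.append(line)
            used += cost
    return sampled
```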
New modules (proposed layout)
- `mempalace/llm_client.py` — provider abstraction (ollama, openai-compat, anthropic)
- `mempalace/llm_refine.py` — refinement pass, prompt, JSON-mode parsing, cancellation handling
- `mempalace/convo_scanner.py` — deterministic parse of `.claude/projects/` dirnames
- CLI wiring in `mempalace/cli.py` + config keys in `mempalace/config.py`
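A rough shape for the provider abstraction in `llm_client.py`: a Protocol keeps providers pluggable. The Ollama example uses its real `/api/generate` endpoint with JSON mode; the interface itself is a sketch, not a commitment:

```python
import json
import urllib.request
from typing import Protocol

class LLMProvider(Protocol):
    def complete(self, prompt: str) -> str:
        """Return the model's raw text (JSON mode) for one batch prompt."""
        ...

class OllamaProvider:
    def __init__(self, model: str, endpoint: str = "http://localhost:11434"):
        self.model = model
        self.endpoint = endpoint

    def complete(self, prompt: str) -> str:
        payload = json.dumps({
            "model": self.model,
            "prompt": prompt,
            "format": "json",   # ask Ollama for JSON-constrained output
            "stream": False,
        }).encode()
        req = urllib.request.Request(
            f"{self.endpoint}/api/generate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]
```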
Test plan
- Unit tests per provider with mocked HTTP.
- Integration test against `gemma3:4b` via local Ollama (skippable if Ollama not running).
- Interface-generalization test against one `openai-compat` endpoint.
- Cancellation test (SIGINT during refinement → partial results returned cleanly).
- Golden-fixture tests on synthetic corpora representing the hard cases: diary prose, chat transcripts, mixed-language notes.
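A sketch of the skippable Ollama integration test; the reachability probe is plain socket code, while `mempalace.llm_refine.refine` and its signature are hypothetical placeholders:

```python
import socket

import pytest

def ollama_running(host: str = "localhost", port: int = 11434) -> bool:
    """Best-effort probe for a local Ollama server."""
    try:
        with socket.create_connection((host, port), timeout=0.5):
            return True
    except OSError:
        return False

@pytest.mark.skipif(not ollama_running(), reason="local Ollama not running")
def test_refine_against_local_ollama():
    from mempalace.llm_refine import refine  # hypothetical entry point
    results = refine(["Aria", "Tuesday"], provider="ollama", model="gemma3:4b")
    assert all(r.type in {"PERSON", "PROJECT", "TOPIC", "COMMON_WORD", "AMBIGUOUS"}
               for r in results)
```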
Explicitly out of scope (for this issue)
- Embedding-based semantic search (exists in `searcher.py`; separate concern).
- Knowledge-graph entity linking across palace wings.
- Training / fine-tuning a custom classifier.
- Any change to the existing regex detector beyond what v1 shipped.
Acceptance criteria
- `mempalace init --llm` runs end-to-end with a local Ollama model and produces classified output.
- Same command works against an `openai-compat` endpoint with only flag changes.
- Ctrl-C during refinement returns partial results and reaches `confirm_entities`.
- Default `mempalace init` (no flag) remains zero-API and byte-identical to v1 behavior.
- Documentation updated with provider setup instructions.