feat(init): optional LLM-assisted entity classification (phase 2) #1149

@igorls

Context

PR #1148 landed v1 of init entity detection: deterministic signal from package manifests (package.json, pyproject.toml, Cargo.toml, go.mod) + git authors, with an improved regex fallback for prose. That path is strong for codebases. It's still limited for prose-only corpora (notes, diaries, chat transcripts) where a short capitalized token can be a person, a place, a fictional character, or a common English word — regex alone cannot disambiguate.

This issue tracks phase 2: an opt-in LLM refinement step that takes the candidate set produced by phase 1 and reclassifies it with semantic understanding.

Goals

  • Ship as opt-in only — default remains zero-API / local-first.
  • Default provider is local (Ollama), so the feature can work fully offline on a consumer machine.
  • Pluggable provider interface so a user with an API key can swap in a hosted model.
  • Run interactively during mempalace init with a visible progress indicator and clean cancellation (Ctrl-C → return partial results and proceed).
  • Don't feed raw corpora to the model; feed regex-detected candidates + sampled context.
  • Separately: deterministic parsing of ~/.claude/projects/-slug directory names as an extra project-name source (no LLM required, but useful for conversation-heavy users).

CLI shape

mempalace init <dir>                                        # unchanged, no LLM
mempalace init <dir> --llm                                  # opt-in, use configured default
mempalace init <dir> --llm-provider ollama --llm-model gemma3:4b
mempalace init <dir> --llm-provider openai-compat \
                     --llm-endpoint http://localhost:1234/v1 \
                     --llm-model <any>
mempalace init <dir> --llm-provider anthropic --llm-model claude-haiku-4-5

Persist defaults via mempalace config set llm.provider ollama and mempalace config set llm.model .... MEMPALACE_LLM_* env vars override the persisted values.
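
A minimal resolution sketch, assuming flag-over-env-over-config precedence (the helper name and config-key format are illustrative, not decided here):

import os
from typing import Optional

def resolve_llm_setting(key: str, flag_value: Optional[str], stored: dict) -> Optional[str]:
    """Resolve one llm.* setting: explicit CLI flag, then MEMPALACE_LLM_* env var, then stored config."""
    if flag_value is not None:                                  # e.g. --llm-provider ollama
        return flag_value
    env_value = os.environ.get(f"MEMPALACE_LLM_{key.upper()}")  # e.g. MEMPALACE_LLM_PROVIDER
    if env_value:
        return env_value
    return stored.get(f"llm.{key}")                             # persisted via `mempalace config set`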

Architecture

New discover_entities(dir, llm=None) layering:

1. project_scanner         — manifests + git (fast, deterministic)
2. convo_scanner (new)     — parse .claude/projects/-slug dirnames
3. regex detector          — prose fallback (fast, noisy)
4. llm_refine (if --llm)   — sample corpus, classify candidates, merge
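
One way the layering could compose. The proposed signature is discover_entities(dir, llm=None); the sketch below takes the scanners as arguments only so it stands alone, and every callable name is illustrative:

from pathlib import Path
from typing import Callable, List, Optional

Candidate = dict  # e.g. {"name": "...", "type": "PERSON", "confidence": 0.6}

def discover_entities(
    dir: Path,
    scanners: List[Callable[[Path], List[Candidate]]],               # layers 1-3, in order
    llm_refine: Optional[Callable[[List[Candidate], Path], List[Candidate]]] = None,
) -> List[Candidate]:
    """Deterministic scanners always run; the LLM pass runs only when --llm was given."""
    candidates: List[Candidate] = []
    for scan in scanners:                         # project_scanner, convo_scanner, regex detector
        candidates.extend(scan(dir))
    if llm_refine is not None:
        candidates = llm_refine(candidates, dir)  # sample corpus, classify, merge
    return candidates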

LLM step specifically:

  • Take top N regex-detected candidates (with their current type + confidence).
  • For each, collect 3–5 sample context lines from the source corpus.
  • Send candidates in batches of 30–50 per request to stay within a small model's context window.
  • Request structured JSON output: {name, type ∈ {PERSON, PROJECT, TOPIC, COMMON_WORD, AMBIGUOUS}, reason}.
  • Merge: reclassify, drop COMMON_WORD, flag AMBIGUOUS for user review (batching and merge sketched below).
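
The batching and merge described above could look roughly like this; the field names and the choice to keep unanswered candidates with their regex classification are assumptions:

from dataclasses import dataclass
from typing import Iterator, List

BATCH_SIZE = 40                    # 30-50 candidates per request to fit a small model's context

@dataclass
class Verdict:
    name: str
    type: str                      # PERSON | PROJECT | TOPIC | COMMON_WORD | AMBIGUOUS
    reason: str

def batches(candidates: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield fixed-size slices of the regex candidates, one LLM request per slice."""
    for i in range(0, len(candidates), size):
        yield candidates[i:i + size]

def merge(candidates: List[dict], verdicts: List[Verdict]) -> List[dict]:
    """Apply LLM verdicts: reclassify, drop COMMON_WORD, flag AMBIGUOUS for the confirm step."""
    by_name = {v.name: v for v in verdicts}
    merged = []
    for cand in candidates:
        v = by_name.get(cand["name"])
        if v is None:
            merged.append(cand)            # never classified (e.g. cancelled batch): keep regex result
        elif v.type == "COMMON_WORD":
            continue                       # drop outright
        else:
            merged.append({**cand, "llm_type": v.type, "llm_reason": v.reason,
                           "needs_review": v.type == "AMBIGUOUS"})
    return merged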

Interactive UX

  • Progress line during refinement: "refining candidates: 12/50 (current: NAME_REDACTED)".
  • Ctrl-C behavior: interrupt the current batch, accept results from completed batches, skip the rest, and proceed to confirm_entities with whatever was classified (see the sketch after this list).
  • If provider/model unavailable: fail fast with a clear error, do NOT silently fall back to regex-only (user asked for LLM, respect that).
  • If LLM disagrees with regex classification, show both in the confirm step so the user can adjudicate.
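
The Ctrl-C handling could live in the batch loop itself, roughly like this (classify_batch stands in for one provider call; the real progress line counts candidates rather than batches):

from typing import Callable, Iterable, List

def refine_with_cancellation(
    candidate_batches: Iterable[List[dict]],
    classify_batch: Callable[[List[dict]], List[dict]],
) -> List[dict]:
    """Classify batch by batch; on Ctrl-C, keep completed batches and let init continue."""
    classified: List[dict] = []
    try:
        for i, batch in enumerate(candidate_batches, start=1):
            print(f"refining candidates: batch {i}", end="\r", flush=True)
            classified.extend(classify_batch(batch))   # one request per batch
    except KeyboardInterrupt:
        # Skip the remaining batches; unclassified candidates keep their regex type
        # and flow into confirm_entities alongside the LLM-classified ones.
        print("\ninterrupted: keeping results from completed batches")
    return classified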

Scale approach (sampling)

  • For code: skip the LLM entirely. Manifests are authoritative.
  • For prose / conversations: sample stratified by recency (newer first) and source weight (user-authored content > machine-generated). Target budget: ~50–100K tokens total input across all batches (a sampling sketch follows this list).
  • On a 4B local model via Ollama, expected wall time: roughly 1–3 minutes for a large prose corpus. Acceptable for one-time init.
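
One plausible shape for the stratified sampling; the tuple layout, ordering, and the chars-per-token estimate are illustrative only:

from typing import Iterable, List, Tuple

def sample_context(
    lines: Iterable[Tuple[str, float, bool]],   # (text, mtime, user_authored)
    token_budget: int = 75_000,                 # middle of the ~50-100K target
) -> List[str]:
    """Pick context newest-first, preferring user-authored sources, until the token budget is spent."""
    # Stratify: user-authored beats machine-generated, then newer beats older.
    ordered = sorted(lines, key=lambda t: (not t[2], -t[1]))
    picked: List[str] = []
    spent = 0
    for text, _mtime, _user_authored in ordered:
        cost = max(1, len(text) // 4)           # crude tokens-from-chars estimate
        if spent + cost > token_budget:
            break
        picked.append(text)
        spent += cost
    return picked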

New modules (proposed layout)

  • mempalace/llm_client.py — provider abstraction (ollama, openai-compat, anthropic); a minimal interface sketch follows this list
  • mempalace/llm_refine.py — refinement pass, prompt, JSON-mode parsing, cancellation handling
  • mempalace/convo_scanner.py — deterministic parse of .claude/projects/ dirnames
  • CLI wiring in mempalace/cli.py + config keys in mempalace/config.py
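
The provider abstraction in llm_client.py could be a single-method interface; the method name and return shape are assumptions, and the concrete classes map onto the three backends named above:

from typing import Protocol

class LLMProvider(Protocol):
    """Everything llm_refine.py needs from a backend, regardless of where it runs."""

    def classify(self, prompt: str, *, model: str) -> dict:
        """Send one batch prompt and return the parsed JSON object for that batch."""
        ...

# OllamaProvider, OpenAICompatProvider, and AnthropicProvider would each translate
# this call into their backend's HTTP API; llm_refine.py depends only on the
# Protocol, which is what keeps --llm-provider pluggable.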

Test plan

  • Unit tests per provider with mocked HTTP.
  • Integration test against gemma3:4b via local Ollama (skippable if Ollama is not running; a skip-gate sketch follows this list).
  • Interface-generalization test against one openai-compat endpoint.
  • Cancellation test (SIGINT during refinement → partial results returned cleanly).
  • Golden-fixture tests on synthetic corpora representing the hard cases: diary prose, chat transcripts, mixed-language notes.
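
The skippable integration test could gate on the default local Ollama endpoint being reachable, roughly like this (the port is Ollama's stock default; the test body is a placeholder):

import urllib.error
import urllib.request

import pytest

def ollama_running(url: str = "http://localhost:11434") -> bool:
    """True if a local Ollama server answers on its default port."""
    try:
        with urllib.request.urlopen(url, timeout=1):
            return True
    except (urllib.error.URLError, OSError):
        return False

@pytest.mark.skipif(not ollama_running(), reason="local Ollama not running")
def test_refine_against_local_gemma():
    # Would run the real refinement pass against gemma3:4b and assert every
    # returned type is one of PERSON / PROJECT / TOPIC / COMMON_WORD / AMBIGUOUS.
    ...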

Explicitly out of scope (for this issue)

  • Embedding-based semantic search (exists in searcher.py, separate concern).
  • Knowledge-graph entity linking across palace wings.
  • Training / fine-tuning a custom classifier.
  • Any change to the existing regex detector beyond what v1 shipped.

Acceptance criteria

  • mempalace init --llm runs end-to-end with a local Ollama model and produces classified output.
  • Same command works against an openai-compat endpoint with only flag changes.
  • Ctrl-C during refinement returns partial results and reaches confirm_entities.
  • Default mempalace init (no flag) remains zero-API and byte-identical to v1 behavior.
  • Documentation updated with provider setup instructions.
