feat(init): optional LLM-assisted entity classification (phase 2) #1149

@igorls

Context

PR #1148 landed v1 of init entity detection: deterministic signal from package manifests (package.json, pyproject.toml, Cargo.toml, go.mod) + git authors, with an improved regex fallback for prose. That path is strong for codebases. It's still limited for prose-only corpora (notes, diaries, chat transcripts) where a short capitalized token can be a person, a place, a fictional character, or a common English word — regex alone cannot disambiguate.

This issue tracks phase 2: an opt-in LLM refinement step that takes the candidate set produced by phase 1 and reclassifies it with semantic understanding.

Goals

  • Ship as opt-in only — default remains zero-API / local-first.
  • Default provider is local (Ollama), so the feature can work fully offline on a consumer machine.
  • Pluggable provider interface so a user with an API key can swap in a hosted model.
  • Run interactively during mempalace init with a visible progress indicator and clean cancellation (Ctrl-C → return partial results and proceed).
  • Don't feed raw corpora to the model; feed regex-detected candidates + sampled context.
  • Separately: deterministic parsing of ~/.claude/projects/-slug directory names as an extra project-name source (no LLM required, but useful for conversation-heavy users).

CLI shape

mempalace init <dir>                                        # unchanged, no LLM
mempalace init <dir> --llm                                  # opt-in, use configured default
mempalace init <dir> --llm-provider ollama --llm-model gemma3:4b
mempalace init <dir> --llm-provider openai-compat \
                     --llm-endpoint http://localhost:1234/v1 \
                     --llm-model <any>
mempalace init <dir> --llm-provider anthropic --llm-model claude-haiku-4-5

Persist defaults via mempalace config set llm.provider ollama and mempalace config set llm.model .... MEMPALACE_LLM_* env vars override the persisted values.
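
A minimal resolution sketch, assuming flag-over-env-over-config precedence (the helper name and config-key format are illustrative, not decided here):

import os
from typing import Optional

def resolve_llm_setting(key: str, flag_value: Optional[str], stored: dict) -> Optional[str]:
    """Resolve one llm.* setting: explicit CLI flag, then MEMPALACE_LLM_* env var, then stored config."""
    if flag_value is not None:                                  # e.g. --llm-provider ollama
        return flag_value
    env_value = os.environ.get(f"MEMPALACE_LLM_{key.upper()}")  # e.g. MEMPALACE_LLM_PROVIDER
    if env_value:
        return env_value
    return stored.get(f"llm.{key}")                             # persisted via `mempalace config set`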

Architecture

New discover_entities(dir, llm=None) layering:

1. project_scanner         — manifests + git (fast, deterministic)
2. convo_scanner (new)     — parse .claude/projects/-slug dirnames
3. regex detector          — prose fallback (fast, noisy)
4. llm_refine (if --llm)   — sample corpus, classify candidates, merge
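
One way the layering could compose. The proposed signature is discover_entities(dir, llm=None); the sketch below takes the scanners as arguments only so it stands alone, and every callable name is illustrative:

from pathlib import Path
from typing import Callable, List, Optional

Candidate = dict  # e.g. {"name": "...", "type": "PERSON", "confidence": 0.6}

def discover_entities(
    dir: Path,
    scanners: List[Callable[[Path], List[Candidate]]],               # layers 1-3, in order
    llm_refine: Optional[Callable[[List[Candidate], Path], List[Candidate]]] = None,
) -> List[Candidate]:
    """Deterministic scanners always run; the LLM pass runs only when --llm was given."""
    candidates: List[Candidate] = []
    for scan in scanners:                         # project_scanner, convo_scanner, regex detector
        candidates.extend(scan(dir))
    if llm_refine is not None:
        candidates = llm_refine(candidates, dir)  # sample corpus, classify, merge
    return candidates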

LLM step specifically:

  • Take top N regex-detected candidates (with their current type + confidence).
  • For each, collect 3–5 sample context lines from the source corpus.
  • Send candidates in batches of 30–50 per request to stay within a small model's context window.
  • Request structured JSON output: {name, type ∈ {PERSON, PROJECT, TOPIC, COMMON_WORD, AMBIGUOUS}, reason}.
  • Merge: reclassify, drop COMMON_WORD, flag AMBIGUOUS for user review (batching and merge sketched below).
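
The batching and merge described above could look roughly like this; the field names and the choice to keep unanswered candidates with their regex classification are assumptions:

from dataclasses import dataclass
from typing import Iterator, List

BATCH_SIZE = 40                    # 30-50 candidates per request to fit a small model's context

@dataclass
class Verdict:
    name: str
    type: str                      # PERSON | PROJECT | TOPIC | COMMON_WORD | AMBIGUOUS
    reason: str

def batches(candidates: List[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield fixed-size slices of the regex candidates, one LLM request per slice."""
    for i in range(0, len(candidates), size):
        yield candidates[i:i + size]

def merge(candidates: List[dict], verdicts: List[Verdict]) -> List[dict]:
    """Apply LLM verdicts: reclassify, drop COMMON_WORD, flag AMBIGUOUS for the confirm step."""
    by_name = {v.name: v for v in verdicts}
    merged = []
    for cand in candidates:
        v = by_name.get(cand["name"])
        if v is None:
            merged.append(cand)            # never classified (e.g. cancelled batch): keep regex result
        elif v.type == "COMMON_WORD":
            continue                       # drop outright
        else:
            merged.append({**cand, "llm_type": v.type, "llm_reason": v.reason,
                           "needs_review": v.type == "AMBIGUOUS"})
    return merged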

Interactive UX

  • Progress line during refinement: "refining candidates: 12/50 (current: NAME_REDACTED)".
  • Ctrl-C behavior: interrupt the current batch, accept results from completed batches, skip the rest, and proceed to confirm_entities with whatever was classified (see the sketch after this list).
  • If provider/model unavailable: fail fast with a clear error, do NOT silently fall back to regex-only (user asked for LLM, respect that).
  • If LLM disagrees with regex classification, show both in the confirm step so the user can adjudicate.
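
The Ctrl-C handling could live in the batch loop itself, roughly like this (classify_batch stands in for one provider call; the real progress line counts candidates rather than batches):

from typing import Callable, Iterable, List

def refine_with_cancellation(
    candidate_batches: Iterable[List[dict]],
    classify_batch: Callable[[List[dict]], List[dict]],
) -> List[dict]:
    """Classify batch by batch; on Ctrl-C, keep completed batches and let init continue."""
    classified: List[dict] = []
    try:
        for i, batch in enumerate(candidate_batches, start=1):
            print(f"refining candidates: batch {i}", end="\r", flush=True)
            classified.extend(classify_batch(batch))   # one request per batch
    except KeyboardInterrupt:
        # Skip the remaining batches; unclassified candidates keep their regex type
        # and flow into confirm_entities alongside the LLM-classified ones.
        print("\ninterrupted: keeping results from completed batches")
    return classified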

Scale approach (sampling)

  • For code: skip the LLM entirely. Manifests are authoritative.
  • For prose / conversations: sample stratified by recency (newer first) and source weight (user-authored content > machine-generated). Target budget: ~50–100K tokens total input across all batches (a sampling sketch follows this list).
  • On a 4B local model via Ollama, expected wall time: roughly 1–3 minutes for a large prose corpus. Acceptable for one-time init.
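
One plausible shape for the stratified sampling; the tuple layout, ordering, and the chars-per-token estimate are illustrative only:

from typing import Iterable, List, Tuple

def sample_context(
    lines: Iterable[Tuple[str, float, bool]],   # (text, mtime, user_authored)
    token_budget: int = 75_000,                 # middle of the ~50-100K target
) -> List[str]:
    """Pick context newest-first, preferring user-authored sources, until the token budget is spent."""
    # Stratify: user-authored beats machine-generated, then newer beats older.
    ordered = sorted(lines, key=lambda t: (not t[2], -t[1]))
    picked: List[str] = []
    spent = 0
    for text, _mtime, _user_authored in ordered:
        cost = max(1, len(text) // 4)           # crude tokens-from-chars estimate
        if spent + cost > token_budget:
            break
        picked.append(text)
        spent += cost
    return picked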

New modules (proposed layout)

  • mempalace/llm_client.py — provider abstraction (ollama, openai-compat, anthropic); a minimal interface sketch follows this list
  • mempalace/llm_refine.py — refinement pass, prompt, JSON-mode parsing, cancellation handling
  • mempalace/convo_scanner.py — deterministic parse of .claude/projects/ dirnames
  • CLI wiring in mempalace/cli.py + config keys in mempalace/config.py
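
The provider abstraction in llm_client.py could be a single-method interface; the method name and return shape are assumptions, and the concrete classes map onto the three backends named above:

from typing import Protocol

class LLMProvider(Protocol):
    """Everything llm_refine.py needs from a backend, regardless of where it runs."""

    def classify(self, prompt: str, *, model: str) -> dict:
        """Send one batch prompt and return the parsed JSON object for that batch."""
        ...

# OllamaProvider, OpenAICompatProvider, and AnthropicProvider would each translate
# this call into their backend's HTTP API; llm_refine.py depends only on the
# Protocol, which is what keeps --llm-provider pluggable.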

Test plan

  • Unit tests per provider with mocked HTTP.
  • Integration test against gemma3:4b via local Ollama (skippable if Ollama is not running; a skip-gate sketch follows this list).
  • Interface-generalization test against one openai-compat endpoint.
  • Cancellation test (SIGINT during refinement → partial results returned cleanly).
  • Golden-fixture tests on synthetic corpora representing the hard cases: diary prose, chat transcripts, mixed-language notes.
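
The skippable integration test could gate on the default local Ollama endpoint being reachable, roughly like this (the port is Ollama's stock default; the test body is a placeholder):

import urllib.error
import urllib.request

import pytest

def ollama_running(url: str = "http://localhost:11434") -> bool:
    """True if a local Ollama server answers on its default port."""
    try:
        with urllib.request.urlopen(url, timeout=1):
            return True
    except (urllib.error.URLError, OSError):
        return False

@pytest.mark.skipif(not ollama_running(), reason="local Ollama not running")
def test_refine_against_local_gemma():
    # Would run the real refinement pass against gemma3:4b and assert every
    # returned type is one of PERSON / PROJECT / TOPIC / COMMON_WORD / AMBIGUOUS.
    ...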

Explicitly out of scope (for this issue)

  • Embedding-based semantic search (exists in searcher.py, separate concern).
  • Knowledge-graph entity linking across palace wings.
  • Training / fine-tuning a custom classifier.
  • Any change to the existing regex detector beyond what v1 shipped.

Acceptance criteria

  • mempalace init --llm runs end-to-end with a local Ollama model and produces classified output.
  • Same command works against an openai-compat endpoint with only flag changes.
  • Ctrl-C during refinement returns partial results and reaches confirm_entities.
  • Default mempalace init (no flag) remains zero-API and byte-identical to v1 behavior.
  • Documentation updated with provider setup instructions.
