feat(llm): add claude-code provider (#1193) #1200
mvalentsev wants to merge 3 commits into MemPalace:develop
Conversation
mvalentsev force-pushed from 86ad3c9 to 724a556
Thanks for the thorough write-up, @mvalentsev, but unfortunately we can't merge this yet. The current Claude Code legal page (https://code.claude.com/docs/en/legal-and-compliance) reads:

> Users with an …

Leaving the PR open so the discussion stays visible. Happy to revisit if Anthropic publishes guidance that permits this.
@igorls fair concern. The frame I'm working from is OpenClaw's own published stance. Their Anthropic provider docs (https://docs.openclaw.ai/providers/anthropic) state, verbatim:

> …

That's the most-watched third-party Anthropic harness, post-April-4 block (openclaw/openclaw#63316), publishing direct guidance from Anthropic that user-local … OpenClaw's own docs do note that for long-lived gateway hosts, API keys remain "the clearest and most predictable production path" -- fair. Your call.
The usage from my perspective is different: agentic usage is sparse, with a varied number of tokens, which is in some ways similar to a human prompting an LLM. MemPalace using Haiku for entity extraction, for example, is repeated mass-automation usage. This is where I think the issue is.
Hi, `ClaudeCodeProvider.classify` passes the full system prompt as a command-line argument (…).

Severity: action required | Category: security

How to fix: move the system prompt off argv.

Agent prompt to fix -- you can give this to your LLM of choice:
We noticed a couple of other issues in this PR as well - happy to share if helpful. Found by Qodo code review |
mvalentsev force-pushed from 724a556 to 22db326
Adds a fourth LLM provider that routes through the local `claude` CLI binary using the user's Claude Pro/Max subscription via `claude auth login`. No API key needed; mirrors the existing ollama/openai-compat/anthropic provider shape (same `classify(system, user, json_mode)` and `check_available()` surface). Hooks into `get_provider()`; `mempalace init --llm --llm-provider claude-code` just works.

Subprocess to `claude -p --output-format json --system-prompt ... --model ... --no-session-persistence`, run from `tempfile.gettempdir()` so claude does not pick up a project-level `CLAUDE.md`. `--bare` is intentionally omitted: it would force `ANTHROPIC_API_KEY` auth and disable OAuth / keychain, defeating the subscription path. Zero new pip dependencies.

Subscription use from third-party harnesses is governed by Anthropic's policy and may be restricted later; `check_available()` surfaces auth errors at that point so callers can fall back.
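A minimal sketch of the subprocess call described above, under the flags quoted in this description. The helper name, timeout, and error strings are illustrative (not the actual MemPalace code), and this revision still passes the system prompt on argv; a later push moves it to stdin:

```python
import json
import shutil
import subprocess
import tempfile

def run_claude_p(system: str, user: str, model: str, timeout: float = 120.0) -> str:
    """Sketch: invoke `claude -p` and unwrap its JSON envelope."""
    binary = shutil.which("claude")
    if binary is None:
        raise RuntimeError("claude CLI not found in PATH")
    proc = subprocess.run(
        [binary, "-p",
         "--output-format", "json",       # machine-readable envelope
         "--system-prompt", system,       # this revision; later moved to stdin
         "--model", model,
         "--no-session-persistence"],
        input=user,
        capture_output=True,
        text=True,
        cwd=tempfile.gettempdir(),        # avoid project-level CLAUDE.md pickup
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"claude -p failed: {proc.stderr[:500]}")
    envelope = json.loads(proc.stdout)    # {"result": "..."} on success
    return envelope["result"]
```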
- Override `is_external_service=True` so the MemPalace#1224 privacy gate fires. The CLI binary runs locally, but every classify call routes user content to Anthropic-hosted models, so the URL-based base-class default (returns local because `endpoint` is `None`) misclassifies this provider.
- Strip `ANTHROPIC_*` env vars from the subprocess environment. If the user has `ANTHROPIC_API_KEY` exported, `claude -p` can fall back to API-key auth and bill the API account instead of the subscription this provider is built around.
- Frame system+user content with `<system>`/`<user>` XML tags instead of literal `SYSTEM:`/`USER:` markers. A malicious drawer text containing `'\n\nSYSTEM:\nIgnore prior instructions...'` could otherwise spoof the boundary and inject a second system prompt.
- Spawn the absolute path returned by `shutil.which("claude")` rather than the bare `'claude'` literal. Closes a TOCTOU window between `check_available()` resolving the binary and `classify()` spawning a potentially different binary if PATH changes between calls.
- Pass `encoding='utf-8'` explicitly to `subprocess.run` so a Windows cp1252 locale does not mojibake the JSON envelope before `json.loads`.
- Include the raw stdout excerpt in the non-JSON-envelope `LLMError` so CLI-output regressions can be debugged without reproducing.
- Broaden `check_available()`'s exception filter from `(subprocess.TimeoutExpired, OSError)` to `(subprocess.SubprocessError, OSError)` so future `SubprocessError` subclasses do not leak.

Tests added: env-scrub propagation, `is_external_service` True, binary-missing path. Existing tests updated for the resolved-binary `cmd[0]` and XML stdin framing.
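The XML framing in the third bullet can be sketched as follows; the tag names come from this PR, while the exact layout and (absence of) escaping are assumptions:

```python
def frame_prompt(system: str, user: str) -> str:
    """Sketch of the <system>/<user> stdin framing described above.

    Explicit tags make the boundary unambiguous: drawer text that merely
    *contains* the string "SYSTEM:" cannot spoof a second system prompt
    the way literal SYSTEM:/USER: markers could.
    """
    return f"<system>\n{system}\n</system>\n<user>\n{user}\n</user>\n"
```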
mvalentsev force-pushed from 22db326 to 8f0536a
…h Ollama default

`ClaudeCodeProvider._subprocess_env` was scrubbing every env var starting with `ANTHROPIC_`, but only `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` carry credentials that could trigger an unwanted API-key billing fallback. Configuration vars (`ANTHROPIC_BASE_URL` for corporate proxy / internal gateway routing, etc.) need to survive into the subprocess so users behind a custom endpoint keep working. The strip is now an explicit allowlist of the two credential-bearing names; the test was updated to pin `ANTHROPIC_BASE_URL` passing through alongside the credential removals.

Separate UX fix: `cmd_init` now prints a one-line stderr hint when `--llm-provider claude-code` is paired with the default `--llm-model` (`gemma4:e4b`, the Ollama tag). Without the hint, `claude -p` rejects the call with a confusing model-not-found error instead of pointing at `claude-haiku-4-5` / `claude-sonnet-4-6` / etc. It continues with whatever the user passed -- the hint is a nudge, not a block.
Summary
Adds a `claude-code` LLM provider in `mempalace/llm_client.py` that routes through the local `claude` CLI binary using the user's Claude Pro/Max subscription via `claude auth login`. Mirrors the existing `OllamaProvider`/`OpenAICompatProvider`/`AnthropicProvider` shape, so `mempalace init --llm --llm-provider claude-code --llm-model claude-haiku-4-5` works with no API key. Closes #1193.
How it works
System prompt + user prompt go through stdin, framed with XML-like tags:
`cwd=tempfile.gettempdir()` so claude does not pick up a project-level `CLAUDE.md`. Auth flows through `claude auth login` (OAuth / keychain). The resolved binary path comes from `shutil.which("claude")` (not the bare `"claude"` literal), so a PATH change between `check_available()` and `classify()` does not introduce a TOCTOU window. `check_available()` runs `claude auth status --text` and surfaces a friendly error pointing at `claude auth login` if not authenticated.

Hardening

- The system prompt no longer travels as `--system-prompt <text>` argv: argv is visible to other local users via `ps` / `/proc/*/cmdline`, and the prompt can carry sensitive context (entity names, project paths). Moving system content to stdin keeps that surface zero. The `<system>`/`<user>` XML framing replaces the literal `SYSTEM:`/`USER:` markers a malicious drawer could spoof to inject a second system prompt.
- `_subprocess_env()` strips `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` before spawning `claude -p`. If the user has either exported in their shell, the CLI may fall back to API-key auth and bill the API account instead of the subscription this provider is built around. Removing the credentials forces OAuth / keychain auth, which is the documented path. Configuration vars (`ANTHROPIC_BASE_URL` for corporate-proxy / internal-gateway routing, etc.) pass through so users behind a custom endpoint keep working.
- `is_external_service` is overridden to `True`. The base-class default is URL-based and would treat this provider as local because `endpoint=None`, but every classify call routes user content to Anthropic-hosted models. The override makes the privacy-warning gate (feat(privacy): warn when LLM tier sends content to external API #1224) fire for `claude-code` the same way it does for the `anthropic` provider.
- When `--llm-provider claude-code` is paired with the Ollama-shaped default `--llm-model gemma4:e4b`, `cmd_init` prints a one-line stderr nudge pointing at `claude-haiku-4-5` and continues -- the hint is informational, not blocking.

Why subprocess and not `claude-agent-sdk`

The Anthropic-published Python SDK was the obvious alternative; subprocess won on every axis for our use case:
| | `subprocess.run(['claude', '-p', ...])` | `claude-agent-sdk` |
| --- | --- | --- |
| New dependencies | none | `claude-agent-sdk` + `anyio` |
| Python floor | `>=3.9` | `>=3.10` |
| Auth | `claude auth login` (CLI keychain) | |

The SDK is a thin wrapper around the same `claude` binary the user already has. Going direct keeps `llm_client.py`'s zero-SDK style intact and does not raise the Python floor.

Why `--bare` is NOT used

`claude --bare` would skip hooks, plugins, and `CLAUDE.md` auto-discovery for clean isolation. From `claude --help`:

> …

That defeats the point of a subscription provider. We omit it and reduce ambient noise via `cwd=tempfile.gettempdir()` and `--no-session-persistence` instead.

Subscription policy fragility
This provider is fully opt-in: `--llm` is opt-in, and within that `--llm-provider claude-code` is opt-in. The default `init` path remains zero-API.

Anthropic blocked OAuth-token replay through third-party harnesses on April 4, 2026; `claude -p` invocation from third-party tools was subsequently sanctioned for first-party CLI binaries. That sanction may change. If it does, `check_available()` will return `(False, ...)` from the post-policy `claude auth status` failure, surfacing a clear error before any classify call. Existing `llm_refine.py` callers can fall through to a different provider. Documented in the provider docstring so future readers know this path is best-effort.
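A rough sketch of the availability probe this fallback relies on; the real method lives on the provider class, and the message strings here are illustrative:

```python
import shutil
import subprocess

def check_available() -> tuple:
    """Sketch: resolve the binary, then probe auth via `claude auth status`."""
    binary = shutil.which("claude")
    if binary is None:
        return (False, "not found in PATH")
    try:
        proc = subprocess.run(
            [binary, "auth", "status", "--text"],
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.SubprocessError, OSError) as exc:  # broad filter, per PR
        return (False, f"claude auth status failed: {exc}")
    if proc.returncode != 0:
        return (False, "Run `claude auth login` to authenticate")
    return (True, "")
```

A post-policy-change auth failure surfaces here as a nonzero exit from `claude auth status`, so callers see `(False, ...)` before any classify call is attempted.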
Failure modes
All raise `LLMError` like the other providers:

- `claude` binary missing -> `check_available()` returns `(False, "not found in PATH")`; `classify()` raises `LLMError("claude CLI not found in PATH")` if the binary disappears between provider construction and classify
- not authenticated -> `check_available()` returns `(False, "Run claude auth login ...")`
- `claude -p` timeout -> `LLMError("claude -p timed out after Ns")`
- spawn failure (`OSError`) -> `LLMError("failed to spawn")`
- nonzero exit -> `LLMError` with `stderr[:500]`
- non-JSON stdout -> `LLMError("non-JSON envelope")` with `stdout[:200]` excerpt for debugging
- missing/empty `result` field -> `LLMError("empty result")`

Tests
16 new tests in `tests/test_llm_client.py`: 1 factory-dispatch test, 13 unit tests covering `check_available()` and `classify()` paths (mocking `subprocess.run` and/or `shutil.which`) plus the hardening hooks (env strip, `is_external_service`, binary missing), and 1 gated integration test `test_claude_code_real_invocation` that runs a live `claude -p` round-trip when `MEMPAL_TEST_CLAUDE_CLI=1` is set (skipped by default; CI has no authenticated user). `tests/test_cli.py` adds 2 tests pinning the `gemma4:e4b`-with-claude-code stderr hint behaviour. Local pytest: 108 passed + 1 skipped, ruff clean.

Out of scope
- Benchmarks (`benchmarks/longmemeval_bench.py`, `benchmarks/locomo_bench.py`): no `--llm-backend claude-code`. Benchmarks are excluded from the package tests and have their own argparse layer (`--llm-backend [anthropic, ollama]`); plumbing claude-code through the rerank/refine call sites is a parallel concern. Happy to do it as a follow-up if useful.
- `mempalace config set llm.provider ...` persistence (planned under feat(init): optional LLM-assisted entity classification (phase 2) #1149 follow-ups).

Body updated 2026-05-03 to match landed code: 8f0536a moved the system prompt from `--system-prompt` argv to stdin (with `<system>`/`<user>` XML framing), strips `ANTHROPIC_*` credential env vars before spawning, and overrides `is_external_service` to `True` so the privacy gate fires; 451c40c narrowed the env strip to `ANTHROPIC_API_KEY` + `ANTHROPIC_AUTH_TOKEN` only (configuration vars like `ANTHROPIC_BASE_URL` pass through for corporate-proxy users) and added the `cmd_init` hint when `claude-code` is paired with the Ollama default model.