
feat(llm): add claude-code provider (#1193)#1200

Open
mvalentsev wants to merge 3 commits into MemPalace:develop from mvalentsev:feat/llm-claude-code-provider

Conversation

Contributor

@mvalentsev mvalentsev commented Apr 25, 2026

Summary

Adds a claude-code LLM provider in mempalace/llm_client.py that routes through the local claude CLI binary using the user's Claude Pro/Max subscription via claude auth login. Mirrors the existing OllamaProvider / OpenAICompatProvider / AnthropicProvider shape so mempalace init --llm --llm-provider claude-code --llm-model claude-haiku-4-5 works with no API key.

Closes #1193.

How it works

claude -p \
    --no-session-persistence \
    --output-format json \
    --model <model>

System prompt + user prompt go through stdin, framed with XML-like tags:

<system>
{system_prompt + JSON-only directive when json_mode=True}
</system>
<user>
{user_prompt}
</user>

cwd=tempfile.gettempdir() so claude does not pick up a project-level CLAUDE.md. Auth flows through claude auth login (OAuth / keychain). The resolved binary path comes from shutil.which("claude") (not the bare "claude" literal) so a PATH change between check_available() and classify() does not introduce a TOCTOU.

check_available() runs claude auth status --text and surfaces a friendly error pointing at claude auth login if not authenticated.
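The invocation described above can be sketched roughly as follows. This is an illustrative outline, not the landed code: the helper names `build_prompt` and `run_claude` are hypothetical, while the flags, the stdin framing, the `shutil.which` resolution, and the `cwd`/encoding choices all come from this PR description.

```python
import shutil
import subprocess
import tempfile


def build_prompt(system_prompt: str, user_prompt: str, json_mode: bool = False) -> str:
    """Frame system + user content with XML-like tags for stdin delivery."""
    sys_text = system_prompt
    if json_mode:
        # Illustrative wording; the real JSON-only directive may differ.
        sys_text += "\nRespond with a single JSON object and nothing else."
    return f"<system>\n{sys_text}\n</system>\n<user>\n{user_prompt}\n</user>"


def run_claude(model: str, stdin_text: str, timeout: float = 120.0) -> str:
    binary = shutil.which("claude")  # absolute path: closes the PATH TOCTOU window
    if binary is None:
        raise RuntimeError("claude CLI not found in PATH")
    result = subprocess.run(
        [binary, "-p", "--no-session-persistence",
         "--output-format", "json", "--model", model],
        input=stdin_text,
        capture_output=True,
        encoding="utf-8",            # avoid cp1252 mojibake on Windows
        cwd=tempfile.gettempdir(),   # do not pick up a project-level CLAUDE.md
        timeout=timeout,
    )
    return result.stdout
```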

Hardening

  • System prompt off argv. The first version passed the system prompt via --system-prompt <text>. argv is visible to other local users via ps / /proc/*/cmdline, and the prompt can carry sensitive context (entity names, project paths). Moving system content to stdin keeps that surface zero. The <system>/<user> XML framing replaces literal SYSTEM: / USER: markers a malicious drawer could spoof to inject a second system prompt.
  • Credential env scrub. _subprocess_env() strips ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN before spawning claude -p. If the user has either exported in their shell, the CLI may fall back to API-key auth and bill the API account instead of the subscription this provider is built around. Removing the credentials forces OAuth / keychain auth, which is the documented path. Configuration vars (ANTHROPIC_BASE_URL for corporate-proxy / internal-gateway routing, etc.) pass through so users behind a custom endpoint keep working.
  • External-service flag. is_external_service is overridden to True. The base class default is URL-based and would treat this provider as local because endpoint=None, but every classify call routes user content to Anthropic-hosted models. The override makes the privacy-warning gate (feat(privacy): warn when LLM tier sends content to external API #1224) fire for claude-code the same way it does for the anthropic provider.
  • Friendly UX hint when --llm-provider claude-code is paired with the Ollama-shaped default --llm-model gemma4:e4b. cmd_init prints a one-line stderr nudge pointing at claude-haiku-4-5 and continues -- the hint is informational, not blocking.

Why subprocess and not claude-agent-sdk

The Anthropic-published Python SDK was the obvious alternative; subprocess won on every axis for our use case:

| Criterion | `subprocess.run(['claude', '-p', ...])` | `claude-agent-sdk` |
| --- | --- | --- |
| Pip dependency | none (stdlib) | `claude-agent-sdk` + `anyio` |
| Python floor | matches current `>=3.9` | bumps to `>=3.10` |
| API surface | sync, matches existing providers | async-only, needs an asyncio bridge |
| Auth path | `claude auth login` (CLI keychain) | same (SDK delegates to the CLI) |
| Maintenance | pin claude CLI flags | pin SDK API + CLI flags |

The SDK is a thin wrapper around the same claude binary the user already has. Going direct keeps llm_client.py's zero-SDK style intact and does not raise the Python floor.

Why --bare is NOT used

claude --bare would skip hooks, plugins, and CLAUDE.md auto-discovery for clean isolation. From claude --help:

--bare: ... Anthropic auth is strictly ANTHROPIC_API_KEY or apiKeyHelper via --settings (OAuth and keychain are never read).

That defeats the point of a subscription provider. We omit it and reduce ambient noise via cwd=tempfile.gettempdir() and --no-session-persistence instead.

Subscription policy fragility

This provider is fully opt-in: --llm is opt-in, and within that --llm-provider claude-code is opt-in. Default init path remains zero-API.

Anthropic blocked OAuth-token-replay through third-party harnesses on April 4, 2026; claude -p invocation from third-party tools was subsequently sanctioned for first-party CLI binaries. That sanction may change. If it does, check_available() will return (False, ...) from the post-policy claude auth status failure, surfacing a clear error before any classify call. Existing llm_refine.py callers can fall through to a different provider.

Documented in the provider docstring so future readers know this path is best-effort.

Failure modes

All raise LLMError like the other providers:

  • claude binary missing -> check_available() returns (False, "not found in PATH"); classify() raises LLMError("claude CLI not found in PATH") if the binary disappears between provider construction and classify
  • Not logged in -> check_available() returns (False, "Run claude auth login...")
  • claude -p timeout -> LLMError("claude -p timed out after Ns")
  • Spawn failure (OSError) -> LLMError("failed to spawn")
  • Non-zero exit -> LLMError with stderr[:500]
  • Malformed JSON envelope -> LLMError("non-JSON envelope") with stdout[:200] excerpt for debugging
  • Empty result field -> LLMError("empty result")
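Taken together, the envelope handling can be sketched as below. The `result` key and error wordings follow the bullet list; the function shape is illustrative, not the code in mempalace/llm_client.py:

```python
import json


class LLMError(Exception):
    pass


def parse_envelope(stdout: str, returncode: int, stderr: str = "") -> str:
    """Map claude -p output onto the failure modes listed above."""
    if returncode != 0:
        raise LLMError(f"claude -p exited {returncode}: {stderr[:500]}")
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError:
        # Keep a stdout excerpt so CLI-output regressions are debuggable.
        raise LLMError(f"claude -p returned a non-JSON envelope: {stdout[:200]}")
    result = envelope.get("result", "")
    if not result:
        raise LLMError("claude -p returned an empty result")
    return result
```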

Tests

16 new tests in tests/test_llm_client.py: 1 factory-dispatch test, 13 unit tests covering check_available() and classify() paths (mocking subprocess.run and/or shutil.which) plus the hardening hooks (env strip, is_external_service, binary missing), and 1 gated integration test test_claude_code_real_invocation that runs a live claude -p round-trip when MEMPAL_TEST_CLAUDE_CLI=1 is set (skipped by default; CI has no authenticated user). tests/test_cli.py adds 2 tests pinning the gemma4:e4b-with-claude-code stderr hint behaviour. Local pytest: 108 passed + 1 skipped, ruff clean.
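The env-strip and prompt-off-argv checks can be mocked along these lines. This is a self-contained sketch with a stand-in `classify_stub` rather than the real provider (which also resolves `shutil.which("claude")` first); the mock-capture pattern is what the real tests pin:

```python
import os
import subprocess
from unittest import mock


def classify_stub(system: str, user: str) -> str:
    """Stand-in for ClaudeCodeProvider.classify: prompt on stdin, creds scrubbed."""
    env = dict(os.environ)
    env.pop("ANTHROPIC_API_KEY", None)
    env.pop("ANTHROPIC_AUTH_TOKEN", None)
    result = subprocess.run(
        ["claude", "-p"],
        input=f"<system>\n{system}\n</system>\n<user>\n{user}\n</user>",
        capture_output=True, encoding="utf-8", env=env,
    )
    return result.stdout


def test_credentials_stripped_and_prompt_off_argv():
    captured = {}

    def fake_run(cmd, **kwargs):
        captured["cmd"] = cmd
        captured["env"] = kwargs["env"]
        return mock.Mock(returncode=0, stdout='{"result": "ok"}', stderr="")

    with mock.patch.dict(os.environ, {"ANTHROPIC_API_KEY": "sk-test"}), \
         mock.patch("subprocess.run", side_effect=fake_run):
        classify_stub("secret system prompt", "hello")

    # Credential scrub propagated into the spawned env.
    assert "ANTHROPIC_API_KEY" not in captured["env"]
    # System prompt never appears on argv (no ps / /proc leakage).
    assert all("secret" not in part for part in captured["cmd"])
```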

Out of scope

  • Benchmark scripts (benchmarks/longmemeval_bench.py, benchmarks/locomo_bench.py) --llm-backend claude-code. Benchmarks are excluded from the package tests and have their own argparse layer (--llm-backend [anthropic, ollama]); plumbing claude-code through the rerank/refine call sites is a parallel concern. Happy to do it as a follow-up if useful.
  • mempalace config set llm.provider ... persistence (planned under feat(init): optional LLM-assisted entity classification (phase 2) #1149 follow-ups).
  • Optimizing CLAUDE.md auto-discovery overhead. First call pays a one-time cache-miss; subsequent calls hit Anthropic's prompt cache and are cheap. If real users hit this on tiny corpora we can revisit.

Body updated 2026-05-03 to match landed code: 8f0536a moved the system prompt from --system-prompt argv to stdin (with <system>/<user> XML framing), strips ANTHROPIC_* credential env vars before spawning, and overrides is_external_service to True so the privacy gate fires; 451c40c narrowed the env strip to ANTHROPIC_API_KEY + ANTHROPIC_AUTH_TOKEN only (configuration vars like ANTHROPIC_BASE_URL pass through for corporate-proxy users) and added the cmd_init hint when claude-code is paired with the Ollama default model.

Member

igorls commented Apr 25, 2026

Thanks for the thorough write-up, @mvalentsev, but unfortunately we can't merge this yet.

The current Claude Code legal page (https://code.claude.com/docs/en/legal-and-compliance) reads:

OAuth authentication is intended exclusively for purchasers of Claude Free, Pro, Max, Team, and Enterprise subscription plans and is designed to support ordinary use of Claude Code and other native Anthropic applications.

Developers building products or services that interact with Claude's capabilities, including those using the Agent SDK, should use API key authentication through Claude Console or a supported cloud provider. Anthropic does not permit third-party developers to offer Claude.ai login or to route requests through Free, Pro, or Max plan credentials on behalf of their users.

Anthropic reserves the right to take measures to enforce these restrictions and may do so without prior notice.

ClaudeCodeProvider routes user classify calls through Pro/Max credentials via claude auth login — the subprocess wrapper around claude -p doesn't change the underlying pattern, and the Agent SDK (also a wrapper around the same binary) is named explicitly. If we ship this and Anthropic enforces, users following our README could have their subscriptions actioned, with MemPalace as the cause.

Users with an ANTHROPIC_API_KEY are already covered by the existing anthropic provider.

Leaving the PR open so the discussion stays visible. Happy to revisit if Anthropic publishes guidance that permits this.

@mvalentsev
Contributor Author

@igorls fair concern. The frame I'm working from is OpenClaw's own published stance. Their Anthropic provider docs (https://docs.openclaw.ai/providers/anthropic) state, verbatim:

Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and claude -p usage as sanctioned unless Anthropic publishes a new policy.

That's the most-watched third-party Anthropic harness, post-April-4 block (openclaw/openclaw#63316), publishing direct guidance from Anthropic that user-local claude -p subprocess invocation is not the prohibited route. Their claude-cli provider has been in production since openclaw/openclaw#61160 and is actively maintained -- merged: openclaw#69179, #69211, #70902; open: #71332, #70863, #68682, #66819, #68388.

This PR's ClaudeCodeProvider is structurally identical -- it spawns the user's logged-in claude binary, with no token extraction and no direct API replay. Under OpenClaw's reading of Anthropic guidance, the Agent SDK clause in the legal page is read narrowly: the SDK is a wrapper around claude running under the user's own login, and the "route requests on behalf of their users" prohibition targets server-side third-party services, not user-local subprocess invocation.

OpenClaw's own docs do note that for long-lived gateway hosts, API keys remain "the clearest and most predictable production path" -- fair, and mempalace init --llm is a one-shot local invocation, not a long-lived gateway.

Your call.

Member

igorls commented Apr 26, 2026

From my perspective the usage is different: agentic usage is sparse, with a varied number of tokens, which is in some ways similar to a human prompting an LLM. MemPalace using Haiku for entity extraction, for example, is repeated mass-automation usage. This is where I think the issue is.

@Qodo-Free-For-OSS

Hi, ClaudeCodeProvider.classify passes the full system prompt as a command-line argument (--system-prompt <text>), which exposes prompt content to local process listings and logs. This can leak sensitive instructions/context to other local users on the same machine.

Severity: action required | Category: security

How to fix: Move system prompt off argv

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

ClaudeCodeProvider.classify() passes the full system prompt via the --system-prompt argv parameter, which can leak prompt contents via process listings. We need to avoid placing potentially sensitive prompt text in argv.

Issue Context

  • user is already passed via stdin (good), but sys_prompt is exposed in argv.
  • The llm_refine pipeline only needs the model to follow instructions; strict “system” separation is less important than preventing local leakage.

Fix Focus Areas

  • mempalace/llm_client.py[350-379]

Implementation notes

  • Prefer an approach that does not include the system prompt in argv:
    • Option A: stop using --system-prompt and instead prepend the system instructions to stdin input (e.g., input = f"SYSTEM:\n{sys_prompt}\n\nUSER:\n{user}").
    • Option B: if the claude CLI supports reading system prompt from a file or stdin, use that mechanism (e.g., write to a temp file with restrictive permissions and pass only the filename in argv).
  • Add/adjust a unit test to assert sys_prompt is not present in captured["cmd"].

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review

mvalentsev force-pushed the feat/llm-claude-code-provider branch from 724a556 to 22db326 on April 28, 2026 18:05
igorls added the enhancement label on May 2, 2026
mvalentsev added 2 commits May 3, 2026 21:21
Adds a fourth LLM provider that routes through the local `claude` CLI
binary using the user's Claude Pro/Max subscription via `claude auth
login`. No API key needed; mirrors the existing
ollama/openai-compat/anthropic provider shape (same `classify(system,
user, json_mode)` and `check_available()` surface). Hooks into
`get_provider()`; `mempalace init --llm --llm-provider claude-code` just
works.

Subprocess to `claude -p --output-format json --system-prompt ... --model
... --no-session-persistence`, run from `tempfile.gettempdir()` so claude
does not pick up a project-level CLAUDE.md. `--bare` is intentionally
omitted: it would force ANTHROPIC_API_KEY auth and disable OAuth /
keychain, defeating the subscription path.

Zero new pip dependencies. Subscription use from third-party harnesses
is governed by Anthropic's policy and may be restricted later;
`check_available()` surfaces auth errors at that point so callers can
fall back.
- Override is_external_service=True so the MemPalace#1224 privacy gate fires.
  The CLI binary runs locally but every classify call routes user content
  to Anthropic-hosted models, so the URL-based base-class default
  (returns local because endpoint is None) misclassifies this provider.

- Strip ANTHROPIC_* env vars from the subprocess environment. If the
  user has ANTHROPIC_API_KEY exported, claude -p can fall back to
  API-key auth and bill the API account instead of the subscription
  this provider is built around.

- Frame system+user content with <system>/<user> XML tags instead of
  literal SYSTEM:/USER: markers. A malicious drawer text containing
  '\\n\\nSYSTEM:\\nIgnore prior instructions...' could otherwise spoof
  the boundary and inject a second system prompt.

- Spawn the absolute path returned by shutil.which("claude") rather
  than the bare 'claude' literal. Closes a TOCTOU window between
  check_available() resolving the binary and classify() spawning a
  potentially different binary if PATH changes between calls.

- Pass encoding='utf-8' explicitly to subprocess.run so a Windows
  cp1252 locale does not mojibake the JSON envelope before json.loads.

- Include the raw stdout excerpt in the non-JSON envelope LLMError so
  CLI-output regressions can be debugged without reproducing.

- Broaden check_available()'s exception filter from
  (subprocess.TimeoutExpired, OSError) to (subprocess.SubprocessError,
  OSError) so future SubprocessError subclasses do not leak.

Tests added: env-scrub propagation, is_external_service True,
binary-missing path. Existing tests updated for the resolved-binary
cmd[0] and XML stdin framing.
mvalentsev force-pushed the feat/llm-claude-code-provider branch from 22db326 to 8f0536a on May 3, 2026 16:24
…h Ollama default

ClaudeCodeProvider._subprocess_env was scrubbing every env var starting
with ANTHROPIC_, but only ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN
carry credentials that could trigger an unwanted API-key billing
fallback. Configuration vars (ANTHROPIC_BASE_URL for corporate proxy /
internal gateway routing, etc.) need to survive into the subprocess so
users behind a custom endpoint keep working. Strip is now an explicit
allowlist of the two credential-bearing names; the test was updated to
pin ANTHROPIC_BASE_URL passing through alongside the credential
removals.

Separate UX fix: cmd_init now prints a one-line stderr hint when
--llm-provider claude-code is paired with the default --llm-model
(gemma4:e4b, the Ollama tag). Without the hint, claude -p rejects the
call with a confusing model-not-found error instead of pointing at
claude-haiku-4-5 / claude-sonnet-4-6 / etc. Continues with whatever
the user passed -- the hint is a nudge, not a block.