Use your Claude Pro/Max subscription programmatically.
Proxies API calls through the Claude Code CLI binary — the only reliable way to use Claude Sonnet and Opus with OAuth subscription tokens from Python.
from claude_cli_auth import ClaudeCliAuth
client = ClaudeCliAuth()
print(client.query("What is 2+2?", model="claude-sonnet-4-6"))

If you have a Claude Pro or Max subscription ($20-200/month), you'd expect to be able to use those models from Python. You can't, at least not directly:

| Approach | Sonnet/Opus | Haiku | Why |
|---|---|---|---|
| Python anthropic SDK with OAuth token | 429 | Works | Server rejects Python clients for premium models |
| Agent SDK with OAuth token | 401 | Works | Agent SDK doesn't support OAuth tokens at all |
| Raw HTTP with Bearer auth | 429 | Works | Same server-side rejection as Python SDK |
| Console API key (ANTHROPIC_API_KEY) | Works | Works | But costs $$$ per token (separate billing) |
| CLI binary proxy (this package) | Works | Works | Binary IS Claude Code, so OAuth works natively |
We tested 15 different approaches before finding this solution. The Claude Code CLI binary is a compiled Bun/JavaScript application that uses the JavaScript Anthropic SDK — there's something in the connection-level handling (HTTP/2, TLS sessions, server-side session tracking) that the Python SDK cannot replicate.
- Claude Code desktop app installed from claude.ai/code
- Logged in: run claude /login in your terminal
- Active subscription: Claude Pro ($20/mo) or Max ($100-200/mo)
- macOS: Keychain-based token storage (Linux support is experimental)

pip install claude-cli-auth

Or from source:

git clone https://github.com/Chris0x88/claude-cli-auth.git
cd claude-cli-auth
pip install -e .

from claude_cli_auth import ClaudeCliAuth
client = ClaudeCliAuth()
# Check your setup
info = client.check()
print(info)
# {'cli_found': True, 'cli_version': '2.1.87', 'token_found': True, ...}
# Query any model
response = client.query("Explain quantum computing in one sentence.")
print(response)

response = client.query(
"What patterns do you see?",
model="claude-sonnet-4-6",
system="You are a senior data analyst.",
history=[
{"role": "user", "content": "I uploaded sales data"},
{"role": "assistant", "content": "I see Q1 revenue trends..."},
],
)

Since the CLI doesn't support native function calling, tools are injected as text. The model outputs tool calls in a parseable format:

import re
tools = [
{"name": "get_price", "description": "Get stock price", "parameters": {"symbol": "string"}},
]
response = client.query("What's AAPL trading at?", tools=tools)
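# Parse tool calls: [TOOL: get_price {"symbol": "AAPL"}]
# Note: [^}]* stops at the first closing brace, so this pattern only
# handles flat (non-nested) JSON arguments. group(2) is None when a
# call has no arguments.
TOOL_RE = re.compile(r'\[\s*TOOL:\s*(\w+)\s*(\{[^}]*\})?\s*\]')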
for match in TOOL_RE.finditer(response):
    print(f"Call: {match.group(1)}({match.group(2)})")

Your Python App
|
v
ClaudeCliAuth.query()
|
| subprocess.run()
v
Claude Code CLI Binary
(~/Library/Application Support/Claude/claude-code/*/claude.app/.../claude)
|
| JavaScript Anthropic SDK (HTTP/2, OAuth, session management)
v
Anthropic API
|
v
Claude Sonnet / Opus / Haiku
The CLI binary is the exact same code that powers Claude Code desktop. It handles:
- OAuth token authentication natively
- Automatic token refresh
- Correct HTTP/2 connection handling
- Server-side session tracking
- All the beta headers and feature flags
We just call it as a subprocess with --output-format json and parse the result.
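
As a rough illustration of that subprocess pattern, here is a minimal sketch. It is not the package's actual implementation: the helper names are hypothetical, and it assumes the CLI's print mode (`-p`) with `--output-format json` emits a single JSON object whose `result` field holds the response text and whose `is_error` field flags failures.

```python
import json
import subprocess


def build_cli_command(cli_path, prompt, model=None):
    """Build the argv list for a one-shot, non-interactive CLI call."""
    cmd = [cli_path, "-p", prompt, "--output-format", "json"]
    if model:
        cmd += ["--model", model]
    return cmd


def parse_cli_output(stdout):
    """Extract the response text from the CLI's JSON output.

    Assumption: the output is one JSON object with a "result" field
    and an optional boolean "is_error" field.
    """
    payload = json.loads(stdout)
    if payload.get("is_error"):
        raise RuntimeError(payload.get("result", "CLI reported an error"))
    return payload["result"]


def run_query(cli_path, prompt, model=None, timeout=90):
    """Spawn the CLI, wait for it to finish, and return the response text."""
    proc = subprocess.run(
        build_cli_command(cli_path, prompt, model),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or "CLI exited with an error")
    return parse_cli_output(proc.stdout)
```

Passing the prompt as a list element rather than through a shell string avoids quoting problems with arbitrary prompt text.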
The package also exposes lower-level token utilities:
from claude_cli_auth import get_token, is_session_token, build_headers, refresh_token
# Read token from Keychain (auto-refreshes if expired)
token = get_token()
# Check token type
print(is_session_token(token)) # True for OAuth, False for API key
# Build headers for raw HTTP requests (works for Haiku)
headers = build_headers(token)

| Step | What Happens |
|---|---|
| Login | claude /login stores OAuth credentials in macOS Keychain |
| Read | get_token() reads from Keychain entry Claude Code-credentials |
| Expiry | Access tokens expire after ~8 hours |
| Refresh | Automatic via refresh_token() using the refresh_token grant |
| Writeback | Refreshed tokens written back to Keychain (keeps Claude Code in sync) |
| Failure | If refresh fails, get_token() returns None (never returns expired tokens) |
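
The lifecycle in the table above can be sketched roughly as follows. This is an illustration of the expiry and refresh logic only, not the package's code: the credential dictionary keys, the `skew` margin, and the injected `refresh` callable are all assumptions, and the Keychain writeback is replaced by an in-memory update.

```python
import time


def get_valid_access_token(creds, refresh, now=None, skew=60):
    """Return a non-expired access token, or None if refresh fails.

    creds:   dict with hypothetical keys "access_token", "refresh_token",
             and "expires_at" (epoch seconds).
    refresh: callable taking the refresh token and returning a dict of
             updated credential fields, or None on failure.
    skew:    seconds of safety margin before the nominal expiry.
    """
    now = time.time() if now is None else now

    # Still valid: hand back the stored token untouched.
    if creds["expires_at"] - skew > now:
        return creds["access_token"]

    # Expired (or about to be): try the refresh grant.
    new_creds = refresh(creds["refresh_token"])
    if not new_creds:
        return None  # never hand back an expired token

    # Stand-in for the Keychain writeback step.
    creds.update(new_creds)
    return creds["access_token"]
```

Returning None instead of the stale token forces the caller to handle the failure explicitly rather than sending a request that is certain to be rejected.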

| Model | Recommended Path | Notes |
|---|---|---|
| Sonnet 4.6 | CLI proxy | OAuth works, no per-token cost |
| Opus 4.6 | CLI proxy | OAuth works, no per-token cost |
| Haiku 4.5 | Python SDK direct | Works fine with OAuth, supports streaming |
| Free models | OpenRouter / other | Separate API, user-controlled |
For Haiku, you can use the anthropic Python SDK directly with the token from get_token(). Only Sonnet/Opus need the CLI proxy.
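
If an application talks to more than one model, the routing decision can be captured in a small helper. The sketch below is hypothetical (the function name and return labels are not part of the package) and simply matches on substrings of the model name.

```python
def recommended_path(model):
    """Pick a call path for a model name, following the table above.

    Returns "sdk" for Haiku (direct Python SDK with the OAuth token),
    "cli" for Sonnet and Opus (CLI proxy), and "other" for anything else.
    """
    name = model.lower()
    if "haiku" in name:
        return "sdk"
    if "sonnet" in name or "opus" in name:
        return "cli"
    return "other"
```

A caller could then send `"cli"` models through `ClaudeCliAuth.query()` and `"sdk"` models through the anthropic Python SDK using the token from `get_token()`.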
Be aware of these before using in production:
- No streaming: The CLI returns the complete response. No real-time token updates. Typical response time: 5-15 seconds depending on prompt length.
- ~3-5s overhead: Process spawn + CLI bootstrap adds latency on top of model inference time.
- macOS only: Token storage relies on macOS Keychain. Linux support is experimental (token must be provided via environment variable).
- Version-dependent CLI path: _find_claude_cli() handles this dynamically, but a Claude Code update could temporarily break things if the binary path format changes.
- No native tool calling: Tools are injected as text in the prompt. The model outputs [TOOL: name {args}], which you parse yourself. Works well but is not as reliable as native function calling.
- Single-turn by default: Each query() call is independent. Conversation history must be passed explicitly.
- Subscription required: You need an active Claude Pro ($20/mo) or Max subscription.

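
Because tool calls arrive as plain text, the parsing step deserves some care. The following is one possible sketch, not part of the package: `parse_tool_calls` is a hypothetical helper that finds each `[TOOL: name` marker with a regex and then lets the standard library's JSON decoder consume the argument object, so nested braces in the arguments are handled and malformed calls are skipped.

```python
import json
import re

# Matches the opening of a tool call, e.g. "[TOOL: get_price ".
TOOL_START = re.compile(r"\[\s*TOOL:\s*(\w+)\s*")
_decoder = json.JSONDecoder()


def parse_tool_calls(text):
    """Return a list of (name, args) pairs for well-formed tool calls."""
    calls = []
    pos = 0
    while True:
        match = TOOL_START.search(text, pos)
        if match is None:
            break
        pos = match.end()
        args = {}
        if pos < len(text) and text[pos] == "{":
            try:
                # raw_decode parses one JSON value starting at pos and
                # returns it with the index just past its end.
                args, pos = _decoder.raw_decode(text, pos)
            except json.JSONDecodeError:
                continue  # malformed arguments: skip this call
        # Only accept the call if the closing bracket follows.
        if text[pos:].lstrip().startswith("]"):
            calls.append((match.group(1), args))
    return calls
```

For example, `parse_tool_calls('[TOOL: get_price {"symbol": "AAPL"}]')` yields a single pair with the name `get_price` and the arguments already decoded into a dict, ready to dispatch to your own function.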
For posterity — every approach we tested before landing on the CLI proxy:
- Raw requests.post() with Bearer auth: 429
- Raw requests.post() with every beta combination: 429
- httpx with HTTP/2: ImportError (h2 not installed), then 429
- curl with identical headers: 429
- Official anthropic Python SDK with auth_token=: 429
- SDK with max_retries=5: still 429 after all retries
- SDK with metadata (device_id, account_uuid): 429
- SDK with X-Claude-Code-Session-Id header: 429
- Fresh token from OAuth refresh endpoint: 429
- Token from Keychain (same as Claude Code uses): 429
- After killing all 15 other Claude Code sessions: 429
- After 60-second cooldown between calls: 429
- With thinking config enabled: 429
- With temperature: 1: 429
- Claude Agent SDK with CLAUDE_CODE_OAUTH_TOKEN: 401

The 429 errors have no rate-limit headers (unlike normal rate limits). This suggests server-side client fingerprinting that distinguishes Claude Code's JavaScript SDK from Python HTTP clients.
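
To check for this pattern in your own experiments, a small classifier over the response status and headers may help. This is a hypothetical sketch: it assumes an ordinary rate limit carries either a `retry-after` header or one of the `anthropic-ratelimit-*` headers, and it labels any 429 without them as unexplained.

```python
def classify_429(status, headers):
    """Classify a response as "not-429", "rate-limit", or "unexplained".

    headers: any mapping of header names to values; names are compared
    case-insensitively.
    """
    if status != 429:
        return "not-429"
    names = {name.lower() for name in headers}
    has_rate_limit_info = "retry-after" in names or any(
        name.startswith("anthropic-ratelimit-") for name in names
    )
    return "rate-limit" if has_rate_limit_info else "unexplained"
```

An "unexplained" result is only a hint that retrying with backoff is unlikely to help; it does not prove what the server is doing.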
Why not just use an API key?
Console API keys (sk-ant-api03-*) work for all models but use pay-per-token billing — completely separate from your subscription. Heavy usage of Sonnet/Opus through the API can cost hundreds of dollars per month on top of your subscription fee.
Will this break when Claude Code updates?
find_cli() dynamically searches for the newest CLI version, so it handles version upgrades automatically. If Anthropic changes the binary path format entirely, we'll need an update.
Is this against Anthropic's ToS?
Anthropic's policy (as of April 2026) restricts OAuth tokens to Claude Code and Claude.ai. This package uses the Claude Code CLI binary itself; it doesn't extract the token for use elsewhere. The API calls go through Claude Code's own auth stack. That said, use at your own discretion and check current terms.
Why does Haiku work but Sonnet/Opus don't?
We believe Anthropic applies different rate limiting tiers server-side. Haiku (free tier) has relaxed limits, while Sonnet/Opus enforce stricter client validation that only the JavaScript SDK passes.
Can I use this in CI/CD?
Yes, if the runner has Claude Code installed and a cached Keychain credential. For headless environments, you'll need to provide the token via ANTHROPIC_API_KEY and accept that Sonnet/Opus won't work via the Python SDK path (Haiku only).
ClaudeCliAuth(timeout=90, cli_path=None)

| Method | Description |
|---|---|
| query(prompt, model, system, history, tools, timeout) | Send a prompt, get a response |
| check() | Check CLI + token availability |

| Function | Description |
|---|---|
| get_token() | Read token from Keychain with auto-refresh |
| refresh_token(refresh_tok) | Manually refresh an OAuth token |
| force_refresh() | Force-refresh ignoring expiry |
| is_session_token(key) | Check if key is OAuth vs API key |
| build_headers(token) | Build HTTP headers for raw requests |
| get_betas() | Get required beta header values |
| get_cache_control() | Get cache control config for subscribers |

| Exception | When |
|---|---|
| ClaudeCliNotFound | CLI binary not installed |
| TokenNotFound | No valid token anywhere |
| TokenExpired | Token expired, refresh failed |
| AuthError | API returned 401 |
| CliExecutionError | CLI returned an error |
| CliTimeoutError | CLI didn't respond in time |

PRs welcome, especially for:
- Linux/Windows token storage support
- Streaming support (if the CLI adds it)
- Better tool calling patterns
MIT