Commit 336816e

chore: update CLAUDE.md with new CLI commands
1 parent 27006f5 commit 336816e

1 file changed

Lines changed: 52 additions & 31 deletions

CLAUDE.md

@@ -19,33 +19,53 @@ source .envrc
 
 ### Running the Application
 
+**balatrollm** - Main bot CLI:
+
 ```
-usage: balatrollm [-h] [-m MODEL] [-l] [-s STRATEGY] [-u BASE_URL] [-k API_KEY] [-c CONFIG]
-                  [-d RUNS_DIR] [-r RUNS] [-p PORT] [--no-screenshot] [--use-default-paths]
-                  {benchmark} ...
+usage: balatrollm [-h] [-m MODEL] [-l] [-s STRATEGY] [-u BASE_URL]
+                  [-k API_KEY] [-d RUNS_DIR] [-r RUNS_PER_SEED]
+                  [--seeds SEEDS] [-p PORTS] [--no-screenshot]
+                  [--use-default-paths]
 
 LLM-powered Balatro bot
 
-positional arguments:
-  {benchmark}           Available commands
-    benchmark           Analyze runs and generate leaderboards
-
 options:
   -h, --help            show this help message and exit
-  -m, --model MODEL     Model name to use from OpenRouter (default: openai/gpt-oss-20b)
-  -l, --list-models     List available models from OpenRouter and exit
+  -m, --model MODEL     Model name to use from OpenAI-compatible API (required, uses BALATROLLM_MODEL env var if set)
+  -l, --list-models     List available models from OpenAI-compatible API and exit
   -s, --strategy STRATEGY
                         Name of the strategy to use (default: default)
   -u, --base-url BASE_URL
-                        OpenAI-compatible API base URL (default: https://openrouter.ai/api/v1)
+                        OpenAI-compatible API base URL (required, uses BALATROLLM_BASE_URL env var if set)
   -k, --api-key API_KEY
-                        API key (default: OPENROUTER_API_KEY env var)
+                        API key (default: BALATROLLM_API_KEY env var)
   -d, --runs-dir RUNS_DIR
                         Base directory for storing run data (default: current directory)
-  -r, --runs RUNS       Number of times to run the bot with the same configuration (default: 1)
-  -p, --port PORT       Port for BalatroBot client connection (can specify multiple, default: 12346)
-  --no-screenshot       Disable taking screenshots during gameplay (use when running Balatro in headless mode)
-  --use-default-paths   Use BalatroBot's default storage paths for screenshots and game logs (use when balatrobot and balatrollm are running on different systems)
+  -r, --runs-per-seed RUNS_PER_SEED
+                        Number of runs per seed (default: 1)
+  --seeds SEEDS         Comma-separated list of seeds (e.g., AAAA123,BBBB456,CCCC789)
+  -p, --ports PORTS     Comma-separated list of ports for BalatroBot client connections (default: 12346, e.g., 12346,12347,12348)
+  --no-screenshot       Disable taking screenshots during gameplay
+  --use-default-paths   Use BalatroBot's default storage paths for screenshots and game logs
+```
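
The new `--seeds`, `--ports`, and `--runs-per-seed` options take comma-separated lists and a per-seed count. The sketch below shows one way such values could be expanded into a run plan. `split_csv` and `plan_runs` are hypothetical helpers and not part of balatrollm, and the help text does not say how runs are distributed across ports, so that step is left out.

```python
from itertools import product


def split_csv(value: str) -> list[str]:
    """Split a comma-separated option value, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def plan_runs(seeds: str, runs_per_seed: int) -> list[tuple[str, int]]:
    """Return one (seed, run_index) pair per planned run (assumed semantics)."""
    return list(product(split_csv(seeds), range(1, runs_per_seed + 1)))


# Example values taken from the help text above.
ports = [int(p) for p in split_csv("12346,12347,12348")]
jobs = plan_runs("AAAA123,BBBB456,CCCC789", runs_per_seed=2)

print(ports)      # [12346, 12347, 12348]
print(len(jobs))  # 6
```

Under this reading, three seeds with two runs each give six runs in total.
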
+
+**balatrobench** - Benchmark analysis CLI:
+
+```
+usage: balatrobench [-h] (--models | --strategies) [--input-dir INPUT_DIR]
+                    [--output-dir OUTPUT_DIR] [--avif]
+
+Analyze BalatroLLM runs and generate benchmark leaderboards
+
+options:
+  -h, --help            show this help message and exit
+  --models              Analyze by models (compare models within strategies)
+  --strategies          Analyze by strategies (compare strategies for each model)
+  --input-dir INPUT_DIR
+                        Input directory with run data (default: runs/v{version})
+  --output-dir OUTPUT_DIR
+                        Output directory for benchmark results (default: benchmarks/[models|strategies]/v{version})
+  --avif                Convert PNG screenshots to AVIF format after analysis
 ```
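
The `(--models | --strategies)` part of the usage line is how `argparse` renders a required mutually exclusive group. As a minimal sketch, a parser with this shape could be declared as follows. This is an illustration built from the help text only, not the project's actual source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose options mirror the balatrobench help text."""
    parser = argparse.ArgumentParser(
        prog="balatrobench",
        description="Analyze BalatroLLM runs and generate benchmark leaderboards",
    )
    # Exactly one of the two analysis modes must be given.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--models", action="store_true",
                      help="Analyze by models (compare models within strategies)")
    mode.add_argument("--strategies", action="store_true",
                      help="Analyze by strategies (compare strategies for each model)")
    parser.add_argument("--input-dir", help="Input directory with run data")
    parser.add_argument("--output-dir", help="Output directory for benchmark results")
    parser.add_argument("--avif", action="store_true",
                        help="Convert PNG screenshots to AVIF format after analysis")
    return parser


args = build_parser().parse_args(["--models", "--avif"])
print(args.models, args.strategies, args.avif)  # True False True
```

With this declaration, passing both `--models` and `--strategies`, or neither, makes `argparse` exit with a usage error.
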
 
 ### Development
@@ -55,20 +75,13 @@ BalatroLLM Development Makefile
 
 Available targets:
 help Show this help message
-install Install package dependencies
-install-dev Install package with development dependencies
-lint Run ruff linter (check only)
-lint-fix Run ruff linter with auto-fixes
+install Install dependencies
+lint Run ruff linter with auto-fixes
 format Run ruff formatter
 typecheck Run type checker
 quality Run all code quality checks
-test Run tests
-test-cov Run tests with coverage report
-all Run all code quality checks and tests
-clean Clean build artifacts and caches
-setup Kill previous instances and start Balatro
+setup Kill previous instances and start Balatro (INSTANCES=1)
 teardown Stop Balatro processes
-balatrobench Run benchmark for all models and generate analysis
 ```
 
 ### Game Automation
@@ -190,15 +203,23 @@ tests/test_llm.py # Test suite
 
 **Run Data Structure:**
 
-- `runs/[version]/[strategy]/[vendor]/[model-name]/[timestamp]_[deck]_[seed]/`
-- JSONL format for performance analysis across vendors, models, and strategies
+- `runs/v{version}/{strategy}/{vendor}/{model}/{timestamp}_{deck}_s{stake}_{seed}/`
+- Each run directory contains: config.json, strategy.json, stats.json, gamestates.jsonl, requests.jsonl, responses.jsonl, run.log, screenshots/
 - Strategy-first organization enables easy comparison across vendors/models within strategies
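
To make the new run layout concrete, the sketch below assembles a run directory path from its parts and lists which of the documented files are absent. `run_dir` and `missing_files` are hypothetical helpers. The version string, timestamp format, and deck name used in the example are assumptions, since the diff does not specify them.

```python
from pathlib import Path

# File names listed in the updated CLAUDE.md (the screenshots/ directory is not checked here).
RUN_FILES = [
    "config.json", "strategy.json", "stats.json",
    "gamestates.jsonl", "requests.jsonl", "responses.jsonl", "run.log",
]


def run_dir(version: str, strategy: str, vendor: str, model: str,
            timestamp: str, deck: str, stake: int, seed: str,
            base: str = ".") -> Path:
    """Build runs/v{version}/{strategy}/{vendor}/{model}/{timestamp}_{deck}_s{stake}_{seed}."""
    leaf = f"{timestamp}_{deck}_s{stake}_{seed}"
    return Path(base) / "runs" / f"v{version}" / strategy / vendor / model / leaf


def missing_files(run: Path) -> list[str]:
    """Return the documented files that do not exist in the run directory."""
    return [name for name in RUN_FILES if not (run / name).is_file()]


# Hypothetical example values; only the seed and model name appear in the diff.
example = run_dir("1.0.0", "default", "openai", "gpt-oss-20b",
                  "20250101_120000", "RedDeck", 1, "AAAA123")
print(example.as_posix())
```
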
 
-**Benchmark Results Structure:**
+**Benchmark Results Structure (Dual Modes):**
+
+Benchmarks can be generated in two modes using the `balatrobench` CLI:
+
+**By Models Mode** (`--models`):
+- `benchmarks/models/v{version}/{strategy}/leaderboard.json` - Models ranked within each strategy
+- `benchmarks/models/v{version}/{vendor}/{model}/{strategy}/stats.json` - Detailed model stats per strategy
+- Compare different models within each strategy
 
-- `benchmarks/[version]/[strategy]/[vendor]/[model-name].json` - Detailed model analysis
-- `benchmarks/[version]/[strategy]/leaderboard.json` - Strategy-specific leaderboard
-- Hierarchical structure matches runs organization for consistency
+**By Strategies Mode** (`--strategies`):
+- `benchmarks/strategies/v{version}/{strategy}/leaderboard.json` - Models ranked within strategy
+- `benchmarks/strategies/v{version}/{vendor}/{model}.json` - Model stats across strategies
+- Compare different strategies for each model
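
The two modes write to parallel trees that differ in the mode segment and in where per-model stats live. A small sketch of the path templates above, using hypothetical helper names (`leaderboard_path`, `model_stats_path`) and an assumed version string:

```python
from pathlib import Path
from typing import Optional

MODES = ("models", "strategies")


def leaderboard_path(mode: str, version: str, strategy: str) -> Path:
    """benchmarks/{mode}/v{version}/{strategy}/leaderboard.json"""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return Path("benchmarks") / mode / f"v{version}" / strategy / "leaderboard.json"


def model_stats_path(mode: str, version: str, vendor: str, model: str,
                     strategy: Optional[str] = None) -> Path:
    """Per-model stats file; models mode nests one stats.json per strategy."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    root = Path("benchmarks") / mode / f"v{version}" / vendor
    if mode == "models":
        if strategy is None:
            raise ValueError("models mode needs a strategy")
        return root / model / strategy / "stats.json"
    return root / f"{model}.json"


# "1.0.0" is an assumed version string used only for illustration.
print(leaderboard_path("models", "1.0.0", "default").as_posix())
print(model_stats_path("strategies", "1.0.0", "openai", "gpt-oss-20b").as_posix())
```
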
 
 ## Strategy System
 