 <div align="center">
   <h1>BalatroBench</h1>
   <p align="center">
+    <a href="https://github.com/coder/balatrobench/releases">
+      <img alt="GitHub release" src="https://img.shields.io/github/v/release/coder/balatrobench?include_prereleases&sort=semver&style=for-the-badge&logo=github"/>
+    </a>
     <a href="https://discord.gg/TPn6FYgGPv">
       <img alt="Discord" src="https://img.shields.io/badge/discord-server?style=for-the-badge&logo=discord&logoColor=%23FFFFFF&color=%235865F2"/>
     </a>
   </p>
-  <div>
-    <img width="1024" alt="Screenshot: BalatroBench" src="https://github.com/user-attachments/assets/33a52df0-a7f8-4784-a640-0212267ed199" />
-  </div>
-  <br>
-  <p><em>Benchmark LLMs' strategic performance in Balatro</em></p>
+  <div><img src="./site/assets/balatrobench.svg" alt="balatrobench" width="170" height="170"></div>
+  <p><em>Benchmark LLMs playing Balatro</em></p>
 </div>

 ---

-## Quick Start
+BalatroBench is a benchmark analysis tool and leaderboard for [BalatroLLM](https://github.com/coder/balatrollm) runs. It processes game data and generates interactive leaderboards comparing LLM models and strategies playing [Balatro](https://www.playbalatro.com/).
+
+## 📚 Documentation
+
+#### Requirements

-To run this project locally:
+- [uv](https://docs.astral.sh/uv/) - Python package manager
+- [npm](https://www.npmjs.com/) - Node.js package manager
+
+#### Installation
+
+Install Python and npm dependencies:

 ```bash
-# Serve from site/ directory
-python3 -m http.server 8000 --directory site
+make install
+source .venv/bin/activate
 ```

-Then visit [http://localhost:8000](http://localhost:8000)
+This runs `uv sync` for Python packages and `npm install` for Playwright tests.
+
+For browser binaries (first time only):
+
+```bash
+npx playwright install
+```

-### Local Data
+#### Generating Benchmarks

 Generate benchmark data from BalatroLLM runs:

 ```bash
-# Generate benchmark data (requires ../balatrollm/runs symlinked or present)
-uv run balatrobench --input-dir runs/v1.0.0
-```
+# Default: reads from ../balatrollm/runs/v1.0.0, writes to site/benchmarks
+uv run balatrobench

-Then in `site/config.js`, set `environment: 'development'` to use local data.
+# Custom paths
+uv run balatrobench --input-dir /path/to/runs/v1.0.0 --output-dir /path/to/output

-## Testing
+# Enable WebP conversion for screenshots
+uv run balatrobench --webp
+```

-The project includes end-to-end tests using Playwright.
+#### Starting the Website

-### Running Tests
+Serve the site locally:

 ```bash
-# Install dependencies (first time only)
-npm install
-npx playwright install # Install browser binaries
+npm run serve
+# or
+python3 -m http.server 8000 --directory site
+```
+
+Then visit [http://localhost:8000](http://localhost:8000)
+
+To use local benchmark data, set `environment: 'development'` in `site/config.js`.

+#### Running Tests
+
+End-to-end tests use Playwright:
+
+```bash
 # Run tests headless (default)
 npm test

@@ -62,10 +89,8 @@ npm run test:debug

 The test server is automatically started by Playwright (see `playwright.config.js`).

-## Upload to CDN
+## 🚀 Related Projects

-This project make use of BunnyCDN for hosting static assets in benchmarks directory. If you have access to the CDN, you can upload the data with
-
-```
-uv run upload.py
-```
+- [**BalatroBot**](https://github.com/coder/balatrobot): API for developing Balatro bots
+- [**BalatroLLM**](https://github.com/coder/balatrollm): Play Balatro with LLMs
+- [**BalatroBench**](https://github.com/coder/balatrobench): Benchmark LLMs playing Balatro