CLAUDE.md: 44 additions & 17 deletions
@@ -4,29 +4,41 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-BalatroBench is a static web application that displays performance leaderboards for LLMs playing the card game Balatro. It's a frontend-only project without build tools - the site uses vanilla JavaScript with Tailwind CSS loaded from CDN.
+BalatroBench is a static web application that displays performance leaderboards for LLMs playing the card game Balatro. It's a frontend-only project without build tools - the site uses vanilla JavaScript with Tailwind CSS and Chart.js loaded from CDN.
 
 ## Architecture
 
 ### Core Components
 
 - **index.html**: Main leaderboard page with responsive table layout using Tailwind CSS
-- **script.js**: Fetches and renders leaderboard data from JSON files in the data directory
-- **data/**: Contains benchmark results organized by version and strategy
-  - `data/benchmarks/v0.8.0/default/leaderboard.json`: Primary leaderboard data
+- **script.js**: Fetches and renders leaderboard data, with interactive expandable rows showing detailed charts and statistics
+- **data/**: Contains benchmark results organized by version, strategy, and data type
+  - `data/benchmarks/v0.8.0/default/leaderboard.json`: Primary model leaderboard data
+  - `data/community/v0.8.0/default/leaderboard.json`: Community strategy leaderboard data
   - Individual model result files in vendor subdirectories (e.g., `openai/gpt-oss-120b.json`)
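The two leaderboard paths listed above differ only in their second segment, so a loader could plausibly build both from one helper. The following is a minimal sketch, not the code in `script.js`: the function names `leaderboardPath` and `loadLeaderboard` and the injectable `fetchImpl` parameter are assumptions made for illustration.

```javascript
// Hypothetical helper (not taken from script.js): builds the path to a
// leaderboard file. `kind` is "benchmarks" (models) or "community" (strategies).
function leaderboardPath(kind, version = "v0.8.0", strategy = "default") {
  if (kind !== "benchmarks" && kind !== "community") {
    throw new Error(`unknown leaderboard kind: ${kind}`);
  }
  return `data/${kind}/${version}/${strategy}/leaderboard.json`;
}

// Hypothetical loader. `fetchImpl` defaults to the global fetch and can be
// replaced with a stub when no server is running.
async function loadLeaderboard(kind, fetchImpl = fetch) {
  const response = await fetchImpl(leaderboardPath(kind));
  if (!response.ok) {
    throw new Error(`failed to load ${kind} leaderboard: HTTP ${response.status}`);
  }
  return response.json();
}
```

Validating `kind` up front turns a mistyped mode into an immediate error rather than a 404 from the static server.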
 
 ### Data Structure
 
 The leaderboard displays AI model performance with metrics including:
 - Final round reached (with standard deviation)
 - Success/failure/error rates for API calls
-- Token usage (input/output)
-- Execution time and cost per game
-- Multiple provider usage statistics
+- Token usage (input/output with standard deviations)
+- Execution time and cost per game (with standard deviations)
+- Provider usage distribution
+- Detailed per-game statistics and histograms
 
 Models are identified by `vendor/model` format and ranked by performance metrics.
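To make the `vendor/model` identifier and the "with standard deviation" metrics concrete, here is a small sketch. It assumes per-game values arrive as plain arrays of numbers and uses the population standard deviation; the diff does not say which variant the project uses, and both function names are hypothetical.

```javascript
// Splits a "vendor/model" identifier at the first slash.
function parseModelId(id) {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`expected vendor/model, got: ${id}`);
  }
  return { vendor: id.slice(0, slash), model: id.slice(slash + 1) };
}

// Mean and population standard deviation of per-game values
// (assumption: the project may use the sample variant instead).
function meanAndStd(values) {
  if (values.length === 0) return { mean: NaN, std: NaN };
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return { mean, std: Math.sqrt(variance) };
}
```

For example, `parseModelId("openai/gpt-oss-120b")` yields the vendor `openai` and the model `gpt-oss-120b`, and `meanAndStd([2, 4, 4, 4, 5, 5, 7, 9])` gives a mean of 5 and a standard deviation of 2.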
 
+### Interactive Features
+
+- **Expandable Rows**: Click on desktop (lg+) to expand detailed view with:
+  - Round distribution histogram using Chart.js
+  - Provider usage pie chart
+  - Complete per-game statistics table
+  - Total aggregated metrics (tokens, costs, time)
+- **Responsive Design**: Columns hide/show based on screen size
+- **Dual Display Modes**: Support for both model-based and community strategy leaderboards
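The round distribution histogram mentioned above could be produced roughly as follows. This is a sketch under assumptions: the input is taken to be an array of final-round numbers, one per game, the `scales.y` option syntax assumes Chart.js 3 or later, and the function names and dataset label are invented for illustration.

```javascript
// Counts how many games ended at each round, covering every round from the
// lowest to the highest one seen so that empty bins still appear.
function roundHistogram(finalRounds) {
  const min = Math.min(...finalRounds);
  const max = Math.max(...finalRounds);
  const labels = [];
  const counts = [];
  for (let round = min; round <= max; round++) {
    labels.push(String(round));
    counts.push(finalRounds.filter((r) => r === round).length);
  }
  return { labels, counts };
}

// Builds a Chart.js bar-chart configuration object from the histogram.
function histogramConfig(finalRounds) {
  const { labels, counts } = roundHistogram(finalRounds);
  return {
    type: "bar",
    data: { labels, datasets: [{ label: "Games", data: counts }] },
    options: { scales: { y: { beginAtZero: true } } },
  };
}
```

In the browser the result would be passed to the Chart.js constructor along with a canvas element, for example `new Chart(canvas, histogramConfig(rounds))`. Keeping the binning separate from the chart configuration lets the counts be checked without a DOM.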
+
 ## Development Commands
 
 ### Local Development
@@ -38,6 +50,12 @@ python3 -m http.server 8000
 # Then visit http://localhost:8000
```
 
+### Dependencies
+
+- **Tailwind CSS**: Styling framework loaded from CDN
+- **Chart.js**: Charting library for histograms and pie charts
+- **Heroicons**: Icon library (included but minimal usage in current implementation)
+
 ### File Structure Conventions
 
 - All files use UTF-8 encoding with LF line endings