<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>About BalatroBench - LLM Benchmark Platform</title>
    <meta name="description" content="Learn about BalatroBench, the AI benchmark platform that evaluates Large Language Models through strategic Balatro gameplay.">
    <script src="https://cdn.tailwindcss.com"></script>
    <script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
</head>
<body class="bg-gray-900 text-white min-h-screen">
    <!-- Navigation -->
    <nav class="bg-gray-800 border-b border-gray-700">
        <div class="container mx-auto px-4 py-4">
            <div class="flex items-center justify-between">
                <div class="flex items-center space-x-6">
                    <h1 class="text-xl font-bold text-blue-400">BalatroBench</h1>
                    <div class="hidden md:flex space-x-4">
                        <a href="index.html" class="text-gray-300 hover:text-white">Leaderboard</a>
                        <a href="community.html" class="text-gray-300 hover:text-white">Community</a>
                        <a href="about.html" class="text-blue-400 hover:text-blue-300">About</a>
                    </div>
                </div>
            </div>
        </div>
    </nav>

    <div class="container mx-auto px-4 py-8 max-w-4xl">
        <!-- Hero Section -->
        <div class="mb-12">
            <h1 class="text-4xl font-bold bg-gradient-to-r from-blue-400 to-purple-400 bg-clip-text text-transparent mb-6">
                About BalatroBench
            </h1>
            <p class="text-gray-300 text-lg leading-relaxed">
                BalatroBench is a benchmark platform that evaluates Large Language Models by having them play Balatro,
                a roguelike poker deck-building game. Testing AI decision-making in a complex, dynamic environment lets us
                measure a model's strategic thinking, tool use, and adaptive problem solving.
            </p>
        </div>

        <!-- What is Balatro Section -->
        <div class="bg-gray-800 rounded-lg border border-gray-700 mb-8">
            <div class="p-6 border-b border-gray-700">
                <h2 class="text-2xl font-bold text-white flex items-center">
                    <i class="fas fa-gamepad text-purple-400 mr-3"></i>
                    What is Balatro?
                </h2>
            </div>
            <div class="p-6">
                <p class="text-gray-300 mb-4">
                    Balatro is a roguelike poker deck-building game in which players defeat "blinds" by playing poker hands to score chips.
                    The game combines traditional poker mechanics with strategic deck building, resource management, and synergistic card effects.
                </p>
                <div class="grid md:grid-cols-2 gap-6">
                    <div>
                        <h3 class="font-semibold text-white mb-3">Core Mechanics</h3>
                        <ul class="space-y-2 text-sm text-gray-300">
                            <li>• <strong>Scoring Formula:</strong> Chips × Multiplier = Final Score</li>
                            <li>• <strong>Ante Structure:</strong> Each ante is a set of 3 blinds (Small → Big → Boss)</li>
                            <li>• <strong>Limited Resources:</strong> A limited number of hands and discards per round</li>
                            <li>• <strong>Progressive Difficulty:</strong> Each ante increases the challenge</li>
                        </ul>
                    </div>
                    <div>
                        <h3 class="font-semibold text-white mb-3">Strategic Elements</h3>
                        <ul class="space-y-2 text-sm text-gray-300">
                            <li>• <strong>Joker Synergies:</strong> Cards that modify scoring</li>
                            <li>• <strong>Money Management:</strong> Interest system and economics</li>
                            <li>• <strong>Hand Type Focus:</strong> Specializing in specific poker hands</li>
                            <li>• <strong>Boss Adaptation:</strong> Preparing for unique challenges</li>
                        </ul>
                    </div>
                </div>
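                <!-- Illustrative sketch, not part of the original page: a small Alpine.js
                     calculator for the scoring formula listed under Core Mechanics. It relies on
                     the Alpine.js script already loaded in the head. The default values
                     (100 chips, multiplier 4) and the styling classes are assumptions chosen
                     for illustration only. -->
                <div class="bg-gray-700 rounded-lg p-4 mt-6" x-data="{ chips: 100, mult: 4 }">
                    <h3 class="font-semibold text-white mb-2">Scoring Example</h3>
                    <p class="text-sm text-gray-300 mb-3">
                        A simplified sketch of the scoring formula using made-up values. In a real run, Jokers and
                        other card effects can change both numbers before they are multiplied.
                    </p>
                    <div class="flex flex-wrap items-center gap-2 text-sm text-gray-300">
                        <!-- x-model.number keeps the bound values numeric so the product below is arithmetic -->
                        <label>Chips <input type="number" min="0" x-model.number="chips" class="w-24 bg-gray-800 border border-gray-600 rounded px-2 py-1 text-white"></label>
                        <span>×</span>
                        <label>Multiplier <input type="number" min="0" x-model.number="mult" class="w-24 bg-gray-800 border border-gray-600 rounded px-2 py-1 text-white"></label>
                        <span>=</span>
                        <!-- Re-evaluated by Alpine whenever either input changes -->
                        <strong class="text-white" x-text="chips * mult"></strong>
                    </div>
                </div>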
            </div>
        </div>

        <!-- How It Works Section -->
        <div class="bg-gray-800 rounded-lg border border-gray-700 mb-8">
            <div class="p-6 border-b border-gray-700">
                <h2 class="text-2xl font-bold text-white flex items-center">
                    <i class="fas fa-cogs text-blue-400 mr-3"></i>
                    How BalatroBench Works
                </h2>
            </div>
            <div class="p-6">
                <div class="grid gap-6">
                    <div class="bg-gray-700 rounded-lg p-4">
                        <h3 class="font-semibold text-white mb-2 flex items-center">
                            <span class="bg-blue-500 w-8 h-8 rounded-full flex items-center justify-center text-sm mr-3">1</span>
                            AI Agent Integration
                        </h3>
                        <p class="text-gray-300 text-sm">
                            Language models are connected to <strong>BalatroLLM</strong>, a bot that interfaces with the game through
                            decision-making tools. The model receives the current game state and must choose its next action.
                        </p>
                    </div>
                    <div class="bg-gray-700 rounded-lg p-4">
                        <h3 class="font-semibold text-white mb-2 flex items-center">
                            <span class="bg-purple-500 w-8 h-8 rounded-full flex items-center justify-center text-sm mr-3">2</span>
                            Strategic Decision Making
                        </h3>
                        <p class="text-gray-300 text-sm">
                            Models must balance several strategic priorities: money management, joker synergies, hand type focus,
                            boss blind preparation, and resource optimization. Each decision affects long-term success.
                        </p>
                    </div>
                    <div class="bg-gray-700 rounded-lg p-4">
                        <h3 class="font-semibold text-white mb-2 flex items-center">
                            <span class="bg-green-500 w-8 h-8 rounded-full flex items-center justify-center text-sm mr-3">3</span>
                            Performance Evaluation
                        </h3>
                        <p class="text-gray-300 text-sm">
                            Every model is run on the same fixed set of seeds. Results report average ante reached, win rate,
                            token efficiency, and strategic decision quality over multiple runs.
                        </p>
                    </div>
                </div>
            </div>
        </div>

        <!-- Methodology Section -->
        <div class="bg-gray-800 rounded-lg border border-gray-700 mb-8">
            <div class="p-6 border-b border-gray-700">
                <h2 class="text-2xl font-bold text-white flex items-center">
                    <i class="fas fa-chart-line text-green-400 mr-3"></i>
                    Evaluation Methodology
                </h2>
            </div>
            <div class="p-6">
                <div class="grid md:grid-cols-2 gap-6">
                    <div>
                        <h3 class="font-semibold text-white mb-3">Benchmark Parameters</h3>
                        <ul class="space-y-2 text-sm text-gray-300">
                            <li>• <strong>Game Version:</strong> Balatro v1.0.1n</li>
                            <li>• <strong>Consistency:</strong> 100 predefined seeds</li>
                            <li>• <strong>Configuration:</strong> Standard deck, no modifications</li>
                            <li>• <strong>Reproducibility:</strong> Deterministic testing environment</li>
                        </ul>
                    </div>
                    <div>
                        <h3 class="font-semibold text-white mb-3">Evaluation Metrics</h3>
                        <ul class="space-y-2 text-sm text-gray-300">
                            <li>• <strong>Average Ante Reached:</strong> Progression depth</li>
                            <li>• <strong>Win Rate:</strong> Success across seed variations</li>
                            <li>• <strong>Token Efficiency:</strong> Resource usage optimization</li>
                            <li>• <strong>Decision Quality:</strong> Strategic choice evaluation</li>
                        </ul>
                    </div>
                </div>
            </div>
        </div>

        <!-- Why This Matters Section -->
        <div class="bg-gray-800 rounded-lg border border-gray-700 mb-8">
            <div class="p-6 border-b border-gray-700">
                <h2 class="text-2xl font-bold text-white flex items-center">
                    <i class="fas fa-lightbulb text-yellow-400 mr-3"></i>
                    Why This Benchmark Matters
                </h2>
            </div>
            <div class="p-6">
                <div class="grid gap-4 text-gray-300">
                    <div>
                        <h3 class="font-semibold text-white mb-2">Beyond Traditional Benchmarks</h3>
                        <p class="text-sm">
                            While most AI benchmarks focus on static question answering or pattern recognition, BalatroBench evaluates
                            dynamic decision-making in complex, multi-step scenarios that require strategic thinking and adaptation.
                        </p>
                    </div>
                    <div>
                        <h3 class="font-semibold text-white mb-2">Real-World Problem Solving</h3>
                        <p class="text-sm">
                            The skills tested (resource management, long-term planning, risk assessment, and synergistic thinking)
                            are also central to real-world work in business strategy, project management, and complex problem solving.
                        </p>
                    </div>
                    <div>
                        <h3 class="font-semibold text-white mb-2">Tool-Using AI Assessment</h3>
                        <p class="text-sm">
                            BalatroBench specifically evaluates models' ability to use tools effectively, make strategic decisions with incomplete
                            information, and optimize for long-term success rather than immediate rewards.
                        </p>
                    </div>
                </div>
            </div>
        </div>

        <!-- Technical Details Section -->
        <div class="bg-gray-800 rounded-lg border border-gray-700 mb-8">
            <div class="p-6 border-b border-gray-700">
                <h2 class="text-2xl font-bold text-white flex items-center">
                    <i class="fas fa-code text-blue-400 mr-3"></i>
                    Technical Implementation
                </h2>
            </div>
            <div class="p-6">
                <p class="text-gray-300 mb-4">
                    BalatroBench is powered by <strong>BalatroLLM</strong>, an open-source Python framework that integrates
                    Large Language Models with Balatro through the BalatroBot interface.
                </p>
                <div class="bg-gray-700 rounded-lg p-4">
                    <h3 class="font-semibold text-white mb-3">Key Components</h3>
                    <ul class="space-y-2 text-sm text-gray-300">
                        <li>• <strong>BalatroBot:</strong> Game interface and automation layer</li>
                        <li>• <strong>LLM Integration:</strong> OpenAI-compatible API with strategic prompting</li>
                        <li>• <strong>Template System:</strong> Jinja2-based prompt engineering</li>
                        <li>• <strong>Evaluation Pipeline:</strong> Automated testing and result aggregation</li>
                    </ul>
                </div>
                <div class="mt-4">
                    <a href="https://github.com/S1M0N38/balatrollm" target="_blank"
                       class="inline-flex items-center text-blue-400 hover:text-blue-300 transition-colors">
                        <i class="fab fa-github mr-2"></i>
                        View on GitHub
                        <i class="fas fa-external-link-alt ml-1 text-xs"></i>
                    </a>
                </div>
            </div>
        </div>

        <!-- Get Involved Section -->
        <div class="bg-gradient-to-r from-blue-500/10 to-purple-500/10 rounded-lg border border-blue-500/20 p-6">
            <h2 class="text-2xl font-bold text-white mb-4 flex items-center">
                <i class="fas fa-users text-purple-400 mr-3"></i>
                Get Involved
            </h2>
            <p class="text-gray-300 mb-6">
                BalatroBench is a community-driven platform. Whether you're an AI researcher, a developer, or a strategy game enthusiast,
                there are several ways to help advance LLM evaluation.
            </p>
            <div class="flex flex-wrap gap-4">
                <a href="https://github.com/S1M0N38/balatrobench/blob/main/CONTRIBUTING.md" target="_blank"
                   class="bg-gradient-to-r from-blue-500 to-purple-600 hover:from-blue-600 hover:to-purple-700 text-white px-6 py-3 rounded-lg font-medium transition-all duration-200 shadow-lg">
                    <i class="fas fa-upload mr-2"></i>Submit Your Strategy
                </a>
                <a href="community.html"
                   class="border border-gray-600 text-gray-300 hover:border-gray-500 hover:text-white px-6 py-3 rounded-lg font-medium transition-colors">
                    <i class="fas fa-comments mr-2"></i>Join Community
                </a>
                <a href="https://github.com/S1M0N38/balatrollm" target="_blank"
                   class="border border-gray-600 text-gray-300 hover:border-gray-500 hover:text-white px-6 py-3 rounded-lg font-medium transition-colors">
                    <i class="fab fa-github mr-2"></i>Contribute Code
                </a>
            </div>
        </div>
    </div>

    <script src="js/app.js"></script>
</body>
</html>