docs: add first version of the docs (WIP)

S1M0N38 · S1M0N38 · commit 74aa38f94372 · 2025-09-22T14:01:55.000+02:00
diff --git a/docs/analysis.md b/docs/analysis.md
@@ -0,0 +1,204 @@
+# Analysis
+
+Generate comprehensive benchmarks, analyze performance metrics, and integrate with BalatroBench for detailed statistics visualization.
+
+## Benchmark Generation
+
+### Basic Benchmarking
+
+Run benchmarks to evaluate model performance:
+
+```bash
+# Benchmark current configuration
+balatrollm benchmark
+
+# Benchmark specific model across strategies
+balatrollm --model openai/gpt-oss-120b benchmark
+
+# Benchmark with multiple runs to reduce run-to-run variance
+balatrollm --runs 20 benchmark
+```
+
+### Comprehensive Benchmarking
+
+Generate benchmarks across multiple dimensions:
+
+```bash
+# Benchmark all models with default strategy
+make balatrobench
+
+# Benchmark specific strategy across models
+balatrollm --strategy aggressive --runs 15 benchmark
+
+# Benchmark multiple strategies and models
+for strategy in default aggressive; do
+  for model in openai/gpt-oss-20b openai/gpt-oss-120b qwen/qwen3-235b-a22b-2507; do
+    balatrollm --strategy "$strategy" --model "$model" --runs 10 benchmark
+  done
+done
+```
+
+## Benchmark Results
+
+### Result Structure
+
+Benchmarks are organized hierarchically:
+
+```
+benchmarks/
+├── v0.10.0/                    # Version
+│   ├── default/                # Strategy
+│   │   ├── openrouter/         # Provider
+│   │   │   ├── gpt-oss-20b.json
+│   │   │   └── gpt-oss-120b.json
+│   │   └── leaderboard.json    # Strategy summary
+│   └── aggressive/
+│       ├── openrouter/
+│       └── leaderboard.json
+```
+
+### Understanding Metrics
+
+Key performance indicators in benchmark results:
+
+- **Win Rate**: Percentage of games won
+- **Average Score**: Mean final score across runs
+- **Consistency**: Standard deviation of scores (lower means more consistent)
+- **Efficiency**: Score per ante progression
+- **Strategy Adherence**: How well the bot follows strategy guidelines
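+
+As a rough illustration of the first three metrics, the sketch below derives win rate, average score, and consistency from a list of run records. The record layout is an assumption: `final_score` mirrors the field queried elsewhere on this page, while `won` is a hypothetical field name, so adapt both to the actual JSON schema.
+
+```python
+from statistics import mean, pstdev
+
+
+def summarize(runs):
+    """Summarize run records into win rate, average score, and consistency."""
+    scores = [run["final_score"] for run in runs]
+    wins = sum(1 for run in runs if run["won"])  # "won" is a hypothetical field
+    return {
+        "win_rate": 100 * wins / len(runs),  # percentage of games won
+        "avg_score": mean(scores),  # mean final score across runs
+        "consistency": pstdev(scores),  # population std dev; lower is steadier
+    }
+
+
+# Made-up records, for illustration only
+runs = [
+    {"won": True, "final_score": 9000},
+    {"won": False, "final_score": 3000},
+    {"won": False, "final_score": 3000},
+    {"won": True, "final_score": 9000},
+]
+print(summarize(runs))
+```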
+
+## BalatroBench Integration
+
+### Overview
+
+[BalatroBench](https://s1m0n38.github.io/balatrobench/) is a web-based dashboard for visualizing and comparing LLM performance in Balatro. It provides interactive charts, leaderboards, and detailed analytics.
+
+*[Screenshot placeholder: BalatroBench dashboard showing model comparison]*
+
+### Uploading Results
+
+Integrate your local benchmark results with BalatroBench:
+
+```bash
+# Generate benchmarks locally
+balatrollm --runs 20 benchmark
+
+# Upload to BalatroBench (coming soon; not available yet)
+balatrollm benchmark --upload
+
+# Or manually copy results to BalatroBench format
+cp benchmarks/v0.10.0/default/leaderboard.json /path/to/balatrobench/data/
+```
+
+### Viewing Results
+
+Access comprehensive analytics through the web interface:
+
+1. **Model Comparison**: Side-by-side performance metrics
+2. **Strategy Analysis**: How different strategies perform across models
+3. **Trend Analysis**: Performance changes over time
+4. **Detailed Breakdowns**: Ante-by-ante progression analysis
+
+*[Screenshot placeholder: Model comparison view in BalatroBench]*
+
+## Local Analysis
+
+### Command-Line Analysis
+
+Analyze results directly from the command line:
+
+```bash
+# View latest benchmark summary
+jq . benchmarks/v0.10.0/default/leaderboard.json
+
+# Compare models
+jq '.models[] | {name: .model, win_rate: .metrics.win_rate, avg_score: .metrics.avg_score}' \
+  benchmarks/v0.10.0/default/leaderboard.json
+
+# Find top performer
+jq '.models | sort_by(.metrics.avg_score) | reverse | .[0]' \
+  benchmarks/v0.10.0/default/leaderboard.json
+```
+
+### Custom Analysis Scripts
+
+Create custom analysis for specific insights:
+
+```bash
+# Calculate average score per run for each model
+find benchmarks -name "*.json" -not -name "leaderboard.json" \
+  -exec jq -r '[.model, (.total_score / .total_runs)] | @csv' {} +
+
+# Compare strategies for same model
+diff <(jq '.models[] | select(.model=="gpt-oss-20b") | .metrics' \
+       benchmarks/v0.10.0/default/leaderboard.json) \
+     <(jq '.models[] | select(.model=="gpt-oss-20b") | .metrics' \
+       benchmarks/v0.10.0/aggressive/leaderboard.json)
+```
+
+## Performance Tracking
+
+### Continuous Monitoring
+
+Set up automated benchmarking:
+
+```bash
+#!/bin/bash
+# Daily benchmark script
+DATE=$(date +%Y%m%d)
+balatrollm --runs 5 --runs-dir "daily_benchmarks/$DATE" benchmark
+```
+
+### Regression Testing
+
+Monitor performance across versions:
+
+```bash
+# Compare current version to previous
+jq '.models[] | {model, current: .metrics.avg_score}' \
+  benchmarks/v0.10.0/default/leaderboard.json > current.json
+
+jq '.models[] | {model, previous: .metrics.avg_score}' \
+  benchmarks/v0.9.0/default/leaderboard.json > previous.json
+
+# Join by model name: slurp both streams, group by model, merge each group
+jq -s 'group_by(.model) | map(add)' current.json previous.json
+```
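+
+The same comparison can be sketched in Python when a per-model delta is more convenient than a joined list. This assumes the leaderboard layout implied by the `jq` queries on this page (a top-level `models` array whose entries carry `model` and `metrics.avg_score`). The helper names `avg_scores`, `compare`, and `load` are illustrative and not part of BalatroLLM.
+
+```python
+import json
+from pathlib import Path
+
+
+def avg_scores(leaderboard):
+    """Map model name -> avg_score from a parsed leaderboard dict."""
+    return {m["model"]: m["metrics"]["avg_score"] for m in leaderboard["models"]}
+
+
+def compare(current, previous):
+    """Return score deltas for models present in both leaderboards."""
+    cur, prev = avg_scores(current), avg_scores(previous)
+    return {
+        name: {
+            "current": cur[name],
+            "previous": prev[name],
+            "delta": cur[name] - prev[name],  # negative means a regression
+        }
+        for name in sorted(cur.keys() & prev.keys())
+    }
+
+
+def load(path):
+    """Read and parse a leaderboard JSON file."""
+    return json.loads(Path(path).read_text())
+
+
+# Made-up leaderboards, for illustration only
+new = {"models": [{"model": "gpt-oss-20b", "metrics": {"avg_score": 120}}]}
+old = {"models": [{"model": "gpt-oss-20b", "metrics": {"avg_score": 100}}]}
+print(compare(new, old))
+```
+
+With real files, the call would look like `compare(load("benchmarks/v0.10.0/default/leaderboard.json"), load("benchmarks/v0.9.0/default/leaderboard.json"))`. Models that appear in only one version are skipped.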
+
+## Interpreting Results
+
+### Statistical Significance
+
+Ensure reliable results:
+
+```bash
+# Run sufficient samples for confidence
+balatrollm --runs 30 benchmark  # Minimum recommended
+
+# Check variance in results (mean and population standard deviation)
+jq '.detailed_runs[] | .final_score' benchmarks/latest/model.json | \
+  awk '{sum+=$1; sumsq+=$1*$1} END {print "Mean:", sum/NR, "StdDev:", sqrt((sumsq-sum*sum/NR)/NR)}'
+```
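+
+To get a feel for how much uncertainty remains at a given run count, the sketch below computes a 95% Wilson score interval for a win rate. This is a standard statistical technique, not something BalatroLLM provides, and the function name `wilson_interval` is illustrative.
+
+```python
+from math import sqrt
+
+
+def wilson_interval(wins, runs, z=1.96):
+    """Wilson score interval for a win rate (z=1.96 gives about 95%)."""
+    if runs <= 0:
+        raise ValueError("runs must be positive")
+    p = wins / runs
+    denom = 1 + z * z / runs
+    center = (p + z * z / (2 * runs)) / denom
+    half = z * sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
+    return center - half, center + half
+
+
+# Interval width for a 50% observed win rate at different run counts
+for runs in (10, 30, 100):
+    low, high = wilson_interval(wins=runs // 2, runs=runs)
+    print(f"{runs:>3} runs: {low:.2f} to {high:.2f}")
+```
+
+At 30 runs with half the games won, the interval still spans roughly 0.33 to 0.67, so small differences between models at that sample size should be read with caution.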
+
+### Model Selection Criteria
+
+Choose models based on your priorities:
+
+- **Consistency**: Low standard deviation in scores
+- **Peak Performance**: Highest maximum scores achieved
+- **Win Rate**: Reliability in completing games successfully
+- **Speed**: Faster response times for real-time applications
+
+### Strategy Optimization
+
+Use results to refine strategies:
+
+```bash
+# Identify successful patterns
+jq '.detailed_runs[] | select(.final_score > 8000) | .strategy_decisions' \
+  benchmarks/v0.10.0/aggressive/openrouter/gpt-oss-120b.json
+
+# Find failure modes
+jq '.detailed_runs[] | select(.final_score < 2000) | .failure_reason' \
+  benchmarks/v0.10.0/default/openrouter/gpt-oss-20b.json
+```
diff --git a/docs/index.md b/docs/index.md
@@ -0,0 +1,47 @@
+# BalatroLLM
+
+**LLM-powered bot that plays Balatro using strategic decision making**
+
+______________________________________________________________________
+
+!!! warning "Pre-1.0 Development Notice"
+
+    This project is currently in a pre-1.0 development phase. According to the [Semantic Versioning](https://semver.org/) specification, minor version updates (0.x.y → 0.(x+1).0) may introduce breaking changes. Please review the release notes carefully before upgrading.
+
+BalatroLLM is an intelligent bot that leverages Large Language Models to play Balatro, the popular roguelike poker deck-building game. The bot uses OpenAI-compatible APIs to communicate with various LLM providers and makes strategic decisions based on comprehensive game state analysis. Whether you're running benchmarks across different models or exploring AI gaming strategies, BalatroLLM provides a robust framework for automated Balatro gameplay.
+
+<div class="grid cards" markdown>
+
+- :material-cog:{ .lg .middle } __Setup__
+
+    ---
+
+    Installation guide covering dependencies, environment setup, and API key configuration.
+
+    [:octicons-arrow-right-24: Setup](setup.md)
+
+- :material-play:{ .lg .middle } __Usage__
+
+    ---
+
+    Learn how to run the bot, configure strategies, and customize gameplay parameters.
+
+    [:octicons-arrow-right-24: Usage](usage.md)
+
+- :material-chart-line:{ .lg .middle } __Analysis__
+
+    ---
+
+    Generate benchmarks, analyze performance metrics, and integrate with BalatroBench for comprehensive statistics.
+
+    [:octicons-arrow-right-24: Analysis](analysis.md)
+
+- :octicons-sparkle-fill-16:{ .lg .middle } __Documentation for LLM__
+
+    ---
+
+    Documentation in [llms.txt](https://llmstxt.org/) format. Just paste the following link (or its content) into the LLM chat.
+
+    [:octicons-arrow-right-24: llms-full.txt](llms-full.txt)
+
+</div>
diff --git a/docs/setup.md b/docs/setup.md
@@ -0,0 +1,135 @@
+# Setup
+
+This guide will help you install and configure BalatroLLM for running LLM-powered Balatro bots.
+
+## Prerequisites
+
+- **Python 3.13+**: BalatroLLM requires Python 3.13 or later
+- **Balatro Game**: You need a copy of Balatro installed
+- **BalatroBot**: The underlying framework for Balatro automation
+- **API Access**: An API key for LLM providers (OpenRouter recommended)
+
+## Installation
+
+### 1. Install BalatroLLM
+
+```bash
+# Clone the repository
+git clone https://github.com/S1M0N38/balatrollm.git
+cd balatrollm
+
+# Install with uv (recommended)
+uv sync --all-extras --group dev
+
+# Or install with pip
+pip install -e .
+```
+
+### 2. Set up BalatroBot
+
+BalatroLLM depends on BalatroBot for game communication. Follow the [BalatroBot installation guide](https://s1m0n38.github.io/balatrobot/installation/) to:
+
+1. Install the BalatroBot Steamodded mod
+2. Configure Balatro for bot communication
+3. Verify the setup works
+
+### 3. Configure Environment Variables
+
+Create a `.envrc` file in the project root:
+
+```bash
+# Copy the example file
+cp .envrc.example .envrc
+
+# Edit .envrc so that it contains your API key:
+export OPENROUTER_API_KEY="your-api-key-here"
+
+# Load the environment
+source .envrc
+```
+
+### 4. Verify Installation
+
+Test that everything is working:
+
+```bash
+# Check available models
+balatrollm --list-models
+
+# Confirm the CLI is installed and list its options (Balatro does not need to be running)
+balatrollm --help
+```
+
+## API Key Setup
+
+### OpenRouter (Recommended)
+
+OpenRouter provides access to multiple LLM providers through a single API:
+
+1. Sign up at [openrouter.ai](https://openrouter.ai)
+2. Generate an API key
+3. Add to your `.envrc` file as `OPENROUTER_API_KEY`
+
+### Other Providers
+
+BalatroLLM supports any OpenAI-compatible API:
+
+```bash
+# Use custom provider
+balatrollm --base-url https://api.your-provider.com/v1 --api-key your-key
+```
+
+## Game Setup
+
+### Start Balatro
+
+Use the provided script to launch Balatro with bot support:
+
+```bash
+# Start single instance on default port 12346
+./balatro.sh
+
+# Start with custom port
+./balatro.sh -p 12347
+
+# Start multiple instances for parallel runs
+./balatro.sh -p 12346 -p 12347
+
+# Start in headless, fast mode (e.g., on servers)
+./balatro.sh --headless --fast
+```
+
+### Verify Connection
+
+Check that BalatroLLM can connect to the game:
+
+```bash
+# Run a single game as a quick connectivity test
+balatrollm --runs 1
+```
+
+## Troubleshooting
+
+### Connection Issues
+
+If the bot can't connect to Balatro:
+
+1. Ensure Balatro is running with the BalatroBot mod
+2. Check that the port matches (default: 12346)
+3. Verify firewall settings allow local connections
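+
+A quick way to work through the port check above is to probe the port directly. The sketch below assumes BalatroBot listens on a local TCP socket. If it uses a different transport, this check does not apply. The helper name `port_is_open` is illustrative.
+
+```python
+import socket
+
+
+def port_is_open(host="127.0.0.1", port=12346, timeout=1.0):
+    """Return True if a TCP connection to host:port succeeds within timeout."""
+    try:
+        with socket.create_connection((host, port), timeout=timeout):
+            return True
+    except OSError:
+        # Covers connection refused, timeouts, and unreachable hosts
+        return False
+
+
+# 12346 is the default port mentioned in this guide; TCP is an assumption
+print("reachable" if port_is_open() else "not reachable")
+```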
+
+### API Issues
+
+If you get API errors:
+
+1. Verify your API key is correct
+2. Check your account balance/credits
+3. Test with a different model using `--model`
+
+### Performance Issues
+
+For better performance:
+
+1. Use `--fast` mode in the Balatro script
+2. Run multiple instances in parallel
+3. Choose faster models for initial testing
diff --git a/docs/usage.md b/docs/usage.md