|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +`shapiq` is a Python library for computing Shapley Interactions for Machine Learning. It approximates any-order Shapley interactions, benchmarks game-theoretical algorithms, and explains feature interactions in model predictions. The repo contains two importable packages: `shapiq` (core) and `shapiq_games` (benchmark games). |
| 8 | + |
| 9 | +## Development Setup |
| 10 | + |
| 11 | +This project uses `uv` for package management. |
| 12 | + |
| 13 | +```sh |
| 14 | +# Install dev dependencies (test + lint + all_ml) |
| 15 | +uv sync |
| 16 | + |
| 17 | +# Install only test dependencies |
| 18 | +uv sync --group test |
| 19 | + |
| 20 | +# Install lint tools |
| 21 | +uv sync --group lint |
| 22 | +``` |
| 23 | + |
| 24 | +## Commands |
| 25 | + |
| 26 | +### Testing |
| 27 | + |
| 28 | +```sh |
| 29 | +# Run all shapiq tests (unit, integration, deprecation) |
| 30 | +uv run pytest tests/shapiq |
| 31 | + |
| 32 | +# Run all shapiq_games tests (parallel) |
| 33 | +uv run pytest tests/shapiq_games -n logical |
| 34 | + |
| 35 | +# Run a single test file |
| 36 | +uv run pytest tests/shapiq/tests_unit/test_interaction_values.py |
| 37 | + |
| 38 | +# Run with coverage |
| 39 | +uv run pytest tests/shapiq --cov=shapiq --cov-report=xml -n logical |
| 40 | +``` |
| 41 | + |
| 42 | +### Linting and Code Quality |
| 43 | + |
| 44 | +```sh |
| 45 | +# Run all pre-commit hooks (ruff lint + format + ty) |
| 46 | +uv run pre-commit run --all-files |
| 47 | + |
| 48 | +# Run ruff linter only |
| 49 | +uv run ruff check src/ tests/ --fix |
| 50 | + |
| 51 | +# Run ruff formatter only |
| 52 | +uv run ruff format src/ tests/ |
| 53 | + |
| 54 | +# Run type checking |
| 55 | +uv run ty check |
| 56 | +``` |
| 57 | + |
| 58 | +### Documentation |
| 59 | + |
| 60 | +```sh |
| 61 | +uv sync --no-dev --group docs |
| 62 | +uv run sphinx-build -b html docs/source docs/build/html |
| 63 | +``` |
| 64 | + |
| 65 | +## Code Architecture |
| 66 | + |
| 67 | +### Package Structure |
| 68 | + |
| 69 | +``` |
| 70 | +src/ |
| 71 | +├── shapiq/ # Core package |
| 72 | +│ ├── interaction_values.py # InteractionValues data class (central output type) |
| 73 | +│ ├── game.py # Base Game class for cooperative games |
| 74 | +│ ├── approximator/ # Approximation algorithms |
| 75 | +│ │ ├── base.py # Approximator base class |
| 76 | +│ │ ├── marginals/ # Owen, Stratified sampling |
| 77 | +│ │ ├── montecarlo/ # SHAPIQ, SVARM, SVARMIQ, UnbiasedKernelSHAP |
| 78 | +│ │ ├── permutation/ # Permutation sampling for SII, STII, SV |
| 79 | +│ │ ├── regression/ # KernelSHAP, KernelSHAPIQ, RegressionFSII/FBII |
| 80 | +│ │ └── sparse/ # SPEX, ProxySPEX (for large feature spaces) |
| 81 | +│ ├── explainer/ # High-level explainer interfaces |
| 82 | +│ │ ├── tabular.py # TabularExplainer (main user-facing class) |
| 83 | +│ │ ├── tree/ # TreeExplainer with model-specific conversions |
| 84 | +│ │ └── product_kernel/ # ProductKernelExplainer |
| 85 | +│ ├── imputer/ # Imputation strategies for missing features |
| 86 | +│ │ ├── marginal_imputer.py # MarginalImputer (most common) |
| 87 | +│ │ ├── baseline_imputer.py |
| 88 | +│ │ ├── gaussian_imputer.py |
| 89 | +│ │ └── tabpfn_imputer.py |
| 90 | +│ ├── game_theory/ # Mathematical game-theory utilities |
| 91 | +│ │ ├── exact.py # ExactComputer for exact interaction values |
| 92 | +│ │ ├── indices.py # ALL_AVAILABLE_CONCEPTS index registry |
| 93 | +│ │ ├── moebius_converter.py |
| 94 | +│ │ └── aggregation.py |
| 95 | +│ ├── plot/ # Visualization functions |
| 96 | +│ └── utils/ # Shared utilities (sets, saving, typing) |
| 97 | +└── shapiq_games/ # Benchmark games package (separate from shapiq) |
| 98 | + ├── benchmark/ # Pre-defined benchmark games per use-case |
| 99 | + ├── synthetic/ # Synthetic game functions |
| 100 | + └── tabular/ # Tabular ML games |
| 101 | +``` |
| 102 | + |
| 103 | +### Core Data Flow |
| 104 | + |
| 105 | +1. **Game** (`game.py`): Wraps any callable as a cooperative game. Subclasses implement `value_function(coalitions) -> np.ndarray`, which takes a boolean coalition matrix (one row per coalition) and returns one scalar game value per row. |
| 106 | + |
| 107 | +2. **Approximator** (`approximator/`): Takes a `Game` and a budget. Calling `approximate(budget, game)` returns `InteractionValues`. All approximators inherit from the `Approximator` base class. |
| 108 | + |
| 109 | +3. **InteractionValues** (`interaction_values.py`): Central data class storing interaction scores as a numpy array with an `interaction_lookup` dict mapping coalition tuples → array indices. Supports arithmetic operations between instances. |
| 110 | + |
| 111 | +4. **Explainer** (`explainer/`): High-level interface combining an ML model + data + an `Imputer` into a `Game`, then calling an `Approximator`. `Explainer.explain(x)` → `InteractionValues`. |
| 112 | + |
| 113 | +5. **Imputer** (`imputer/`): Turns an ML model and data into a game by imputing the features that are missing from a coalition. `MarginalImputer` is the default for tabular data. |
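
The data flow above can be illustrated with a minimal, self-contained sketch. It deliberately does not import `shapiq`: the toy `value_function`, the helper `exact_shapley_values`, and the plain `dict` used as a lookup are hypothetical stand-ins that only mirror the shapes described in the list (boolean coalition matrix in, one value per row out, scores stored in an array indexed through a tuple-keyed lookup). The library's real classes may differ in signatures and details.

```python
from __future__ import annotations

from itertools import combinations
from math import factorial

import numpy as np


def value_function(coalitions: np.ndarray) -> np.ndarray:
    """Toy game (hypothetical): additive weights plus a bonus when players 0 and 1 cooperate."""
    weights = np.array([1.0, 2.0, 3.0])
    additive = coalitions.astype(float) @ weights
    synergy = (coalitions[:, 0] & coalitions[:, 1]).astype(float)
    return additive + synergy  # one scalar per coalition row


def exact_shapley_values(n_players: int) -> tuple[np.ndarray, dict[tuple[int, ...], int]]:
    """Brute-force Shapley values, stored as an array plus a tuple -> index lookup."""
    lookup = {(i,): i for i in range(n_players)}
    values = np.zeros(n_players)
    for i in range(n_players):
        others = [p for p in range(n_players) if p != i]
        for size in range(n_players):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n_players - size - 1) / factorial(n_players)
            for subset in combinations(others, size):
                without_i = np.array([[p in subset for p in range(n_players)]], dtype=bool)
                with_i = without_i.copy()
                with_i[0, i] = True
                marginal = value_function(with_i)[0] - value_function(without_i)[0]
                values[lookup[(i,)]] += weight * marginal
    return values, lookup


values, lookup = exact_shapley_values(3)
print(values[lookup[(0,)]], values[lookup[(1,)]], values[lookup[(2,)]])
```

In this toy game the bonus of 1.0 is split evenly between players 0 and 1, so the scores sum to the value of the grand coalition. An approximator would replace the exhaustive loop with sampling under a fixed budget.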
| 114 | + |
| 115 | +### Interaction Indices |
| 116 | + |
| 117 | +Available indices are defined in `game_theory/indices.py` (`ALL_AVAILABLE_CONCEPTS`). Key ones: |
| 118 | +- `SV` – Shapley Values (order 1 only) |
| 119 | +- `SII` – Shapley Interaction Index |
| 120 | +- `k-SII` – k-Shapley Interaction Index (most common for explanations) |
| 121 | +- `STII` – Shapley-Taylor Interaction Index |
| 122 | +- `FSII` – Faithful Shapley Interaction Index |
| 123 | +- `FBII` – Faithful Banzhaf Interaction Index |
| 124 | +- `BV` – Banzhaf Values |
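
To make the difference between order-1 values and interaction indices concrete, here is a hedged sketch of the textbook pairwise Shapley Interaction Index (`SII`) on a toy game. It is plain Python rather than `shapiq` code: `game` and `pairwise_sii` are hypothetical names, and the weighting follows the standard SII definition for pairs, not necessarily the way the library computes it.

```python
from __future__ import annotations

from itertools import combinations
from math import factorial


def game(coalition: frozenset[int]) -> float:
    """Toy game (hypothetical): additive weights plus a bonus for players 0 and 1 together."""
    weights = {0: 1.0, 1: 2.0, 2: 3.0}
    bonus = 1.0 if {0, 1} <= coalition else 0.0
    return sum(weights[p] for p in coalition) + bonus


def pairwise_sii(i: int, j: int, n_players: int) -> float:
    """Exact SII for the pair (i, j), enumerating all coalitions of the other players."""
    rest = [p for p in range(n_players) if p not in (i, j)]
    total = 0.0
    for size in range(len(rest) + 1):
        # SII weight for interaction order 2: |T|! (n - |T| - 2)! / (n - 1)!
        weight = factorial(size) * factorial(n_players - size - 2) / factorial(n_players - 1)
        for subset in combinations(rest, size):
            t = frozenset(subset)
            # Discrete derivative of the game with respect to the pair {i, j}.
            delta = game(t | {i, j}) - game(t | {i}) - game(t | {j}) + game(t)
            total += weight * delta
    return total


print(pairwise_sii(0, 1, 3))  # interaction between the two synergistic players
print(pairwise_sii(0, 2, 3))  # players with purely additive contributions
```

Only the pair that shares the bonus receives a non-zero interaction score. Purely additive pairs score zero, which is information a first-order index such as `SV` cannot express.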
| 125 | + |
| 126 | +### Code Style |
| 127 | + |
| 128 | +- **Formatter/Linter**: `ruff` with `black` style, line length 100, Google-style docstrings |
| 129 | +- **Type checking**: `ty` (checks `src/shapiq/`; tests are excluded) |
| 130 | +- **All files** must start with `from __future__ import annotations` |
| 131 | +- `isort` is configured with `required-imports = ["from __future__ import annotations"]` |
| 132 | +- The uppercase variable name `X` is allowed in functions (common in ML code) |
| 133 | +- Test files live in `tests/shapiq/` and `tests/shapiq_games/` with separate conftest files |
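
As an illustration of the style rules above, a module following them might look like the sketch below. The function `column_means` is a hypothetical example and not part of `shapiq`. It only shows the required `from __future__ import annotations` first line, a Google-style docstring, and the permitted uppercase `X` argument.

```python
from __future__ import annotations

import numpy as np


def column_means(X: np.ndarray) -> np.ndarray:
    """Compute the mean of each feature column.

    Args:
        X: Data matrix of shape ``(n_samples, n_features)``.

    Returns:
        The per-feature means as an array of shape ``(n_features,)``.
    """
    return X.mean(axis=0)
```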
| 134 | + |
| 135 | +### Test Organization |
| 136 | + |
| 137 | +- `tests/shapiq/tests_unit/` – Unit tests per module |
| 138 | +- `tests/shapiq/tests_integration_tests/` – Integration tests |
| 139 | +- `tests/shapiq/tests_deprecation/` – Deprecation behavior tests |
| 140 | +- `tests/shapiq/fixtures/` – Shared pytest fixtures (data, games, models, interaction values) |
| 141 | +- `tests/shapiq_games/` – Tests for the `shapiq_games` package |
| 142 | + |
| 143 | +### Two-Package Setup |
| 144 | + |
| 145 | +The repo hosts two installable packages: |
| 146 | +- `shapiq` in `src/shapiq/` — the core library |
| 147 | +- `shapiq_games` in `src/shapiq_games/` — optional benchmark games requiring extra ML dependencies (`torch`, `transformers`, `tabpfn`) |
| 148 | + |
| 149 | +`shapiq_games` requires `uv sync --group all_ml` for full functionality. |