
Commit e405ef3

mmschlk and Copilot authored

1.4.0 release (#462)

* moves sparse_transform imports into function calls
* change ci pipeline
* removes override calls
* removes checkmarks because windows is sad and does not like colors :(
* moves ProxySPEX up in the README.md
* updated pyproject.toml
* updated CHANGELOG.md
* Update CHANGELOG.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

1 parent 0641cfd commit e405ef3

6 files changed (114 additions, 60 deletions)

.github/workflows/ci.yml (16 additions, 4 deletions)

```diff
@@ -31,20 +31,32 @@ jobs:
   # ----------------------------------------------------------------------------------------------
   install_and_import_shapiq:
     name: Install and import check shapiq
-    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - os: ubuntu-latest
+            python-version: "3.10"
+          - os: ubuntu-latest
+            python-version: "3.13"
+          - os: windows-latest
+            python-version: "3.12"
+          - os: macos-latest
+            python-version: "3.12"
+    runs-on: ${{ matrix.os }}
     steps:
       - uses: actions/checkout@v5
       - name: Set up Python and uv
         uses: astral-sh/setup-uv@v7
         with:
-          python-version: "3.12"
+          python-version: ${{ matrix.python-version }}
           enable-cache: true
       - name: Create uv virtual environment
         run: uv venv
       - name: Install shapiq package
         run: uv run --no-sync uv pip install .
       - name: Test import
-        run: uv run --no-sync python -c "import shapiq; print('shapiq imported successfully')"
+        run: uv run --no-sync python -c "import shapiq; print('shapiq imported successfully')"
   # ----------------------------------------------------------------------------------------------
   # Install and Import Check
   # ----------------------------------------------------------------------------------------------
@@ -65,7 +77,7 @@ jobs:
       - name: Install dependencies
         run: uv sync --no-dev --group all_ml
       - name: Test import of shapiq_games
-        run: uv run --no-sync python -c "import shapiq_games; print('shapiq_games imported successfully')"
+        run: uv run --no-sync python -c "import shapiq_games; print('shapiq_games imported successfully')"
   # ----------------------------------------------------------------------------------------------
   # Unit Tests with Matrix
   # ----------------------------------------------------------------------------------------------
```

CHANGELOG.md (39 additions, 23 deletions)

```diff
@@ -1,22 +1,31 @@
 # Changelog
 
-## Development
-
-### Introducing ProxySPEX
-Adds the ProxySPEX approximator for efficient computation of sparse interaction values using the new ProxySPEX algorithm.
-For further details refer to: Butler, L., Kang, J.S., Agarwal, A., Erginbas, Y.E., Yu, Bin, Ramchandran, K. (2025). ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs https://arxiv.org/pdf/2505.17495
-
-
-### Introducing ProductKernelExplainer
-The ProductKernelExplainer is a new model-specific explanation method for Product Kernel based machine learning model, such as Gaussian Processes or Support Vector Machines.
-For further details refer to: https://arxiv.org/abs/2505.16516
+## v1.4.0 (2025-10-31)
+
+### Introducing ProxySPEX [#442](https://github.com/mmschlk/shapiq/pull/442)
+Adds the [`ProxySPEX`](https://arxiv.org/pdf/2505.17495) [approximator](https://github.com/mmschlk/shapiq/blob/main/src/shapiq/approximator/sparse/proxyspex.py) for efficient computation of sparse interaction values using the new ProxySPEX algorithm.
+ProxySPEX is a direct extension of the [SPEX](https://openreview.net/pdf?id=UQpYmaBGwB) algorithm, which uses clever fourier representations of the value function and analysis to identify the most relevant interactions (in terms of `Moebius` coefficients) and transforms them into summary scores (Shapley interactions).
+One of the key innovations of ProxySPEX compared to SPEX is the use of a proxy model that approximates the original value function (uses a LightGBM model internally).
+**Notably,** to run ProxySPEX, users have to install the `lightgbm` package in their environment.
+For further details we refer to the paper, which will be presented at NeurIPS'2025: Butler, L., Kang, J.S., Agarwal, A., Erginbas, Y.E., Yu, Bin, Ramchandran, K. (2025). ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs. [arxiv](https://arxiv.org/pdf/2505.17495)
+
+### Introducing ProductKernelExplainer [#431](https://github.com/mmschlk/shapiq/pull/431)
+The `ProductKernelExplainer` is a new model-specific explanation method for machine learning models that utilize Product Kernels, such as Gaussian Processes and Support Vector Machines.
+Similar to the TreeExplainer, it uses a specific computation scheme that leverages the structure of the underlying product kernels to efficiently compute exact Shapley values.
+**Note**, this explainer is only able to compute Shapley values (not higher-order interactions yet).
+For further details we refer to the paper: Mohammadi, M., Chau, S.-L., Muandet, K. Computing Exact Shapley Values in Polynomial Time for Product-Kernel Methods. [arxiv](https://arxiv.org/abs/2505.16516)
+
+### New Conditional Imputation Methods [#435](https://github.com/mmschlk/shapiq/pull/435)
+Based on traditional statistical methods, we implemented two new conditional imputation methods named `GaussianImputer` and `GaussianCopulaImputer` within the `shapiq.imputer` module.
+Both imputation methods are designed to handle missing feature imputation in a way that respects the underlying data distribution with the assumption that the data follows a multivariate Gaussian distribution (`GaussianImputer`) or can be represented with Gaussian copulas (`GaussianCopulaImputer`).
+In practice, this assumption may often be violated, but these methods can still provide reasonable imputations in many scenarios and serve as a useful benchmark enabling easier research in the field of conditional imputation for Shapley value explanations.
 
 ### Shapiq Statically Typechecked [#430](https://github.com/mmschlk/shapiq/pull/430)
 We have introduced static type checking to `shapiq` using [Pyright](https://github.com/microsoft/pyright), and integrated it into our `pre-commit` hooks.
 This ensures that type inconsistencies are caught early during development, improving code quality and maintainability.
 Developers will now benefit from immediate feedback on type errors, making the codebase more robust and reliable as it evolves.
 
-### Separation of `shapiq` into `shapiq`, `shapiq_games`, and `shapiq-benchmark`
+### Separation of `shapiq` into `shapiq`, `shapiq_games`, and `shapiq-benchmark` [#459](https://github.com/mmschlk/shapiq/issues/459)
 We have begun the process of modularizing the `shapiq` package by splitting it into three distinct packages: `shapiq`, `shapiq_games`, and `shapiq-benchmark`.
 
 - The `shapiq` package now serves as the core library. It contains the main functionality, including approximators, explainers, computation routines, interaction value logic, and plotting utilities.
@@ -25,25 +34,32 @@ We have begun the process of modularizing the `shapiq` package by splitting it i
 
 This restructuring aims to improve maintainability and development scalability. The core `shapiq` package will continue to receive the majority of updates and enhancements, and keeping it streamlined ensures better focus and usability. Meanwhile, separating games and benchmarking functionality allows these components to evolve more independently while maintaining compatibility through clearly defined dependencies.
 
+### List of All New Features
+- adds the ProxySPEX (Proxy Sparse Explanation) module in `approximator.sparse` for even more efficient computation of sparse interaction values [#442](https://github.com/mmschlk/shapiq/pull/442)
+- uses `predict_logits` method of sklearn-like classifiers if available in favor of `predict_proba` to support models that also offer logit outputs like TabPFNClassifier for better interpretability of the explanations [#426](https://github.com/mmschlk/shapiq/issues/426)
+- adds the `shapiq.explainer.ProductKernelExplainer` for model-specific explanation of Product Kernel based models like Gaussian Processes and Support Vector Machines. [#431](https://github.com/mmschlk/shapiq/pull/431)
+- adds the `GaussianImputer` and `GaussianCopulaImputer` classes to the `shapiq.imputer` module for conditional imputation based on Gaussian assumptions. [#435](https://github.com/mmschlk/shapiq/pull/435)
+- speeds up the imputation process in `MarginalImputer` by dropping an unnecessary loop [#449](https://github.com/mmschlk/shapiq/pull/449)
+- makes `n_players` argument of `shapiq.ExactComputer` optional when a `shapiq.Game` object is passed [#388](https://github.com/mmschlk/shapiq/issues/388)
+
+### Removed Features and Breaking Changes
+- removes the ability to load `InteractionValues` from pickle files. This is now deprecated and will be removed in the next release. Use `InteractionValues.save(..., as_json=True)` to save interaction values as JSON files instead. [#413](https://github.com/mmschlk/shapiq/issues/413)
+- removes `coalition_lookup` and `value_storage` properties from `shapiq.Game` since the seperated view on game values and coalitions they belong to is now outdated. Use the `shapiq.Game.game_values` dictionary instead. [#430](https://github.com/mmschlk/shapiq/pull/430)
+- reorders the arguments of `shapiq.ExactComputer`'s constructor to have `n_players` be optional if a `shapiq.Game` object is passed. [#388](https://github.com/mmschlk/shapiq/issues/388)
+
+### Bugfixes
+- fixes a bug where RegressionFBII approximator was throwing an error when the index was `'BV'` or `'FBII'`. [#420](https://github.com/mmschlk/shapiq/pull/420)
+- allows subtraction and addition of `InteractionValues` objects with different `index` attributes by ignoring and raising a warning instead of an error. The resulting `InteractionValues` object will have the `index` of the first object. [#423](https://github.com/mmschlk/shapiq/pull/423)
+
 ### Maintenance and Development
 - refactored the `shapiq.Games` and `shapiq.InteractionValues` API by adding an interactions and game_values dictionary as the main data structure to store the interaction scores and game values. This allows for more efficient storage and retrieval of interaction values and game values, as well as easier manipulation of the data. [#419](https://github.com/mmschlk/shapiq/pull/419)
 - addition and subtraction of InteractionValues objects (via `shapiq.InteractionValues.__add__`) now also works for different indices, which will raise a warning and will return a new InteractionValues object with the index set of the first. [#422](https://github.com/mmschlk/shapiq/pull/422)
 - refactors the `shapiq.ExactComputer` to allow for initialization without passing n_players when a `shapiq.Game` object is passed [#388](https://github.com/mmschlk/shapiq/issues/388). Also introduces a tighter type hinting for the `index` parameter using `Literal` types. [#450](https://github.com/mmschlk/shapiq/pull/450)
+- removes zeros from the `InteractionValues.coalition_lookup` from the `MoebiusConverter` for better memory efficiency. [#369](https://github.com/mmschlk/shapiq/issues/369)
 
 ### Docs
 - added an example notebook for `InteractionValues`, highlighting *Initialization*, *Modification*, *Visualization* and *Save and Loading*.
-
-### Bugfixes
-- fixes a bug where RegressionFBII approximator was throwing an error when the index was `'BV'` or `'FBII'`.[#420](https://github.com/mmschlk/shapiq/pull/420)
-
-### All New Features
-- adds the ProxySPEX (Proxy Sparse Explanation) module in `approximator.sparse` for even more efficient computation of sparse interaction values [#442](https://github.com/mmschlk/shapiq/pull/442)
-- uses `predict_logits` method of sklearn-like classifiers if available in favor of `predict_proba` to support models that also offer logit outputs like TabPFNClassifier for better interpretability of the explanations [#426](https://github.com/mmschlk/shapiq/issues/426)
-- adds the `shapiq.explainer.ProductKernelExplainer` for model-specific explanation of Product Kernel based models like Gaussian Processes and Support Vector Machines. [#431](https://github.com/mmschlk/shapiq/pull/431)
-
-### Removed Features
-- removes the ability to load `InteractionValues` from pickle files. This is now deprecated and will be removed in the next release. Use `InteractionValues.save(..., as_json=True)` to save interaction values as JSON files instead. [#413](https://github.com/mmschlk/shapiq/issues/413)
-- removes `coalition_lookup` and `value_storage` properties from `shapiq.Game` since the seperated view on game values and coalitions they belong to is now outdated. Use the `shapiq.Game.game_values` dictionary instead. [#430](https://github.com/mmschlk/shapiq/pull/430)
+- makes API reference docs more consistent by adding missing docstrings and improving existing ones across the package. [#420](https://github.com/mmschlk/shapiq/pull/420), [#437](https://github.com/mmschlk/shapiq/issues/437), [#452](https://github.com/mmschlk/shapiq/issues/452) among others.
 
 ## v1.3.2 (2025-10-14)
 
```
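The ProxySPEX changelog entry above talks about sparsity in the `Moebius` coefficients of the value function. As a self-contained toy illustration (plain Python with a made-up 3-player game, not the shapiq API): the Moebius transform m(S) = Σ_{T ⊆ S} (−1)^{|S|−|T|} v(T) concentrates the game's mass on the few coalitions that genuinely interact, which is exactly the structure sparse approximators like SPEX and ProxySPEX exploit.

```python
from itertools import chain, combinations

def powerset(players):
    """All subsets of a collection of players, as frozensets."""
    s = list(players)
    return [
        frozenset(c)
        for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))
    ]

players = (0, 1, 2)

# Toy value function: each player contributes 1, and players 0 and 1
# earn a bonus of 2 when they cooperate.
def v(coalition):
    return len(coalition) + (2 if {0, 1} <= coalition else 0)

# Moebius transform: m(S) = sum over T subseteq S of (-1)^(|S|-|T|) * v(T)
moebius = {}
for S in powerset(players):
    moebius[S] = sum((-1) ** (len(S) - len(T)) * v(T) for T in powerset(S))

# Only the three singletons and the interacting pair {0, 1} carry
# non-zero mass -> the representation is sparse.
nonzero = {S for S, m in moebius.items() if abs(m) > 1e-12}
```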

README.md (22 additions, 22 deletions)

````diff
@@ -117,6 +117,28 @@ interaction_values.plot_force(feature_names=...)
 <img width="800px" src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/motivation_sv_and_si.png" alt="An example Force Plot for the California Housing Dataset with Shapley Interactions">
 </p>
 
+### Use ProxySPEX (Proxy SParse EXplainer) <img src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/spex_logo.png" alt="spex_logo" align="right" height="75px"/>
+For large-scale use-cases you can also check out the [👓``ProxySPEX``](https://shapiq.readthedocs.io/en/latest/api/shapiq.approximator.sparse.html#shapiq.approximator.sparse.SPEX) approximator.
+
+```python
+# load your data and model with large number of features
+data, model, n_features = ...
+
+# use the ProxySPEX approximator directly
+approximator = shapiq.ProxySPEX(n=n_features, index="FBII", max_order=2)
+fbii_scores = approximator.approximate(budget=2000, game=model.predict)
+
+# or use ProxySPEX with an explainer
+explainer = shapiq.Explainer(
+    model=model,
+    data=data,
+    index="FBII",
+    max_order=2,
+    approximator="proxyspex"  # specify ProxySPEX as approximator
+)
+explanation = explainer.explain(data[0])
+```
+
 ### Visualize feature interactions
 
 A handy way of visualizing interaction scores up to order 2 are network plots.
@@ -162,28 +184,6 @@ fsii_values.plot_force()  # plot the force plot
 <img width="800px" src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/fsii_tabpfn_force_plot_example.png" alt="Force Plot of FSII values as derived from the example tabpfn notebook">
 </p>
 
-### Use ProxySPEX (Proxy SParse EXplainer) <img src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/spex_logo.png" alt="spex_logo" align="right" height="75px"/>
-For large-scale use-cases you can also check out the [👓``ProxySPEX``](https://shapiq.readthedocs.io/en/latest/api/shapiq.approximator.sparse.html#shapiq.approximator.sparse.SPEX) approximator.
-
-```python
-# load your data and model with large number of features
-data, model, n_features = ...
-
-# use the ProxySPEX approximator directly
-approximator = shapiq.ProxySPEX(n=n_features, index="FBII", max_order=2)
-fbii_scores = approximator.approximate(budget=2000, game=model.predict)
-
-# or use ProxySPEX with an explainer
-explainer = shapiq.Explainer(
-    model=model,
-    data=data,
-    index="FBII",
-    max_order=2,
-    approximator="proxyspex"  # specify ProxySPEX as approximator
-)
-explanation = explainer.explain(data[0])
-```
-
 
 ## 📖 Documentation with tutorials
 The documentation of ``shapiq`` can be found at https://shapiq.readthedocs.io.
````

pyproject.toml (2 additions, 0 deletions)

```diff
@@ -26,11 +26,13 @@ dependencies = [
 ]
 authors = [
     {name = "Maximilian Muschalik", email = "Maximilian.Muschalik@lmu.de"},
+    {name = "Santo M. A. R. Thies", email = "S.Thies@campus.lmu.de"},
     {name = "Hubert Baniecki"},
     {name = "Fabian Fumagalli"},
 ]
 maintainers = [
     {name = "Maximilian Muschalik", email = "Maximilian.Muschalik@lmu.de"},
+    {name = "Santo M. A. R. Thies", email = "S.Thies@campus.lmu.de"},
 ]
 license = "MIT"
 classifiers = [
```

src/shapiq/imputer/gaussian_copula_imputer.py (20 additions, 2 deletions)

```diff
@@ -3,7 +3,6 @@
 from __future__ import annotations
 
 from typing import TYPE_CHECKING, cast
-from typing_extensions import override
 
 import numpy as np
 from scipy.stats import norm, rankdata
@@ -35,7 +34,6 @@ class GaussianCopulaImputer(GaussianImputer):
 
     More specifically, values will be clipped to the range ``[epsilon, 1 - epsilon]``."""
 
-    @override
     def __init__(
         self,
         model: (object | Game | Callable[[npt.NDArray[np.floating]], npt.NDArray[np.floating]]),
@@ -46,6 +44,26 @@ def __init__(
         random_state: int | None = None,
         verbose: bool = False,
     ) -> None:
+        """Initializes the GaussianCopulaImputer.
+
+        Args:
+            model: The model to explain as a callable function expecting a data points as input and
+                returning the model's predictions.
+            data: The background data to use for the explainer as a two-dimensional array with shape
+                ``(n_samples, n_features)``.
+            x: The explanation point as a ``np.ndarray`` of shape ``(1, n_features)`` or
+                ``(n_features,)``.
+            sample_size: The number of Monte Carlo samples to draw from the conditional background
+                data for imputation.
+            random_state: An optional random seed for reproducibility.
+            verbose: A flag to enable verbose imputation, which will print a progress bar for model
+                evaluation. Note that this can slow down the imputation process.
+        """
         super().__init__(
             model=model,
             data=data,
```

src/shapiq/imputer/gaussian_imputer.py (15 additions, 9 deletions)

```diff
@@ -3,7 +3,6 @@
 from __future__ import annotations
 
 from typing import TYPE_CHECKING, cast
-from typing_extensions import override
 
 import numpy as np
 from numpy.random import default_rng
@@ -47,14 +46,22 @@ def __init__(
         """Initializes the class.
 
         Args:
-            model: The model to explain as a callable function expecting data points as input and
+            model: The model to explain as a callable function expecting a data points as input and
                 returning the model's predictions.
-            data: The background data to use for the explainer as a ``np.ndarray`` of shape ``(n_samples, n_features)``.
-            x: The explanation point as a ``np.ndarray`` of shape ``(1, n_features)`` or ``(n_features,)``. Defaults to ``None``.
-            sample_size: Number of Monte Carlo samples for imputation. Defaults to ``100``.
-            random_state: The random state to use for sampling. Defaults to ``None``.
-            verbose: A flag to enable verbose imputation, which will print a progress bar for model evaluation.
-                Note that this can slow down the imputation process. Defaults to ``False``.
+            data: The background data to use for the explainer as a two-dimensional array with shape
+                ``(n_samples, n_features)``.
+            x: The explanation point as a ``np.ndarray`` of shape ``(1, n_features)`` or
+                ``(n_features,)``.
+            sample_size: The number of Monte Carlo samples to draw from the conditional background
+                data for imputation.
+            random_state: An optional random seed for reproducibility.
+            verbose: A flag to enable verbose imputation, which will print a progress bar for model
+                evaluation. Note that this can slow down the imputation process.
 
         Raises:
             CategoricalFeatureError: If the background data contains any categorical features.
@@ -207,7 +214,6 @@ def _sample_monte_carlo(
 
         return samples_all_coalitions
 
-    @override
     def value_function(self, coalitions: npt.NDArray[np.bool]) -> npt.NDArray[np.floating]:
         """Imputes the missing values of a data point and gets predictions for all coalitions.
```

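The `GaussianImputer` docstring above describes drawing Monte Carlo samples from the conditional background distribution. Under the multivariate-Gaussian assumption this conditioning has a closed form: for missing features m and observed features o, x_m | x_o is Gaussian with mean mu_m + C_mo C_oo^{-1} (x_o - mu_o) and covariance C_mm - C_mo C_oo^{-1} C_om. A minimal NumPy sketch of that mechanism (not the shapiq implementation; the function name and toy numbers are made up for illustration):

```python
import numpy as np

def conditional_gaussian_sample(mu, cov, x_obs, obs_idx, miss_idx, n_samples, seed=0):
    """Sample missing features conditional on observed ones under a joint Gaussian.

    Uses the standard conditional-Gaussian formulas:
        mu_cond  = mu_m + C_mo @ inv(C_oo) @ (x_obs - mu_o)
        cov_cond = C_mm - C_mo @ inv(C_oo) @ C_om
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    C_oo = cov[np.ix_(obs_idx, obs_idx)]
    C_mo = cov[np.ix_(miss_idx, obs_idx)]
    C_mm = cov[np.ix_(miss_idx, miss_idx)]
    # Solve instead of inverting C_oo explicitly (better conditioned).
    shift = np.linalg.solve(C_oo, np.asarray(x_obs, dtype=float) - mu[obs_idx])
    mu_cond = mu[miss_idx] + C_mo @ shift
    cov_cond = C_mm - C_mo @ np.linalg.solve(C_oo, C_mo.T)
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu_cond, cov_cond, size=n_samples), mu_cond

# Two correlated standard-normal features: observing x0 = 2.0 shifts the
# conditional mean of x1 to 0.8 * 2.0 = 1.6.
mu = [0.0, 0.0]
cov = [[1.0, 0.8], [0.8, 1.0]]
samples, mu_cond = conditional_gaussian_sample(
    mu, cov, x_obs=[2.0], obs_idx=[0], miss_idx=[1], n_samples=500
)
```

The correlation is what distinguishes this conditional imputation from marginal imputation, which would keep sampling x1 around 0 regardless of the observed x0.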