Computing Values with Regression-based estimators is inefficient. #340

@mmschlk


The computation with regression-based estimators is quite slow at the moment. There are multiple inefficiencies in the regression routine which should be optimized:

The attached graph shows the computation of 2-SII scores with KernelSHAP-IQ for 1_000_000 sampled coalitions on an image classifier model with around 60 features. Querying the model takes considerably less time than the computation that happens afterwards.

Potential Bottlenecks:

The nested Python loop that fills the regression matrix entry by entry:

        for coalition_pos, coalition in enumerate(coalitions_matrix):
            for interaction_pos, interaction in enumerate(
                powerset(self._grand_coalition_set, max_size=self.max_order)
            ):
                interaction_size = len(interaction)
                intersection_size = np.sum(coalition[list(interaction)])
                regression_matrix[coalition_pos, interaction_pos] = regression_coefficient_weight[
                    interaction_size, intersection_size
                ]
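One possible way to remove the Python-level double loop is to compute all intersection sizes with a single matrix product and look up the weights with one fancy-indexing call. The sketch below is only an illustration: `build_regression_matrix`, `weight_table` and `membership` are hypothetical names, `itertools.combinations` stands in for `powerset`, and it assumes the interactions are enumerated by increasing size starting with the empty set.

```python
from itertools import chain, combinations

import numpy as np


def build_regression_matrix(coalitions_matrix, weight_table, max_order):
    """Vectorized stand-in for the nested loop (hypothetical helper)."""
    n_players = coalitions_matrix.shape[1]
    # Assumed enumeration order: (), (0,), (1,), ..., (0, 1), ... up to max_order.
    interactions = list(
        chain.from_iterable(
            combinations(range(n_players), size) for size in range(max_order + 1)
        )
    )
    # Binary membership matrix: one row per interaction, one column per player.
    membership = np.zeros((len(interactions), n_players), dtype=np.int64)
    for pos, interaction in enumerate(interactions):
        for player in interaction:
            membership[pos, player] = 1
    interaction_sizes = membership.sum(axis=1)
    # A single matrix product yields every |coalition ∩ interaction| at once.
    intersection_sizes = coalitions_matrix.astype(np.int64) @ membership.T
    # Broadcasted fancy indexing looks up all weights in one call.
    return weight_table[interaction_sizes[np.newaxis, :], intersection_sizes]
```

The intermediate `intersection_sizes` array has one entry per (coalition, interaction) pair, so for very large budgets it would probably have to be computed in chunks over the coalitions or with a smaller integer dtype.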

The try-except around the solver (linked to #338):

        try:
            # try solving via solve function
            shapley_interactions_values = np.linalg.solve(
                regression_matrix.T @ weighted_regression_matrix,
                weighted_regression_matrix.T @ regression_response,
            )
        except np.linalg.LinAlgError:
            # solve WLSQ via lstsq function and throw warning
            regression_weights_sqrt_matrix = np.diag(np.sqrt(regression_weights))
            regression_lhs = np.dot(regression_weights_sqrt_matrix, regression_matrix)
            regression_rhs = np.dot(regression_weights_sqrt_matrix, regression_response)
            warnings.warn(
                UserWarning(
                    "Linear regression equation is singular, a least squares solution is used "
                    "instead.\n"
                )
            )
            shapley_interactions_values = np.linalg.lstsq(
                regression_lhs, regression_rhs, rcond=None
            )[0]
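One cost in the fallback branch is that `np.diag(np.sqrt(regression_weights))` materializes a dense n x n matrix, which for 1_000_000 coalitions would need about 8 TB in float64. Scaling the rows by broadcasting gives the same weighted system without that matrix. This is a minimal sketch, not the library's implementation; `solve_wlsq` is a hypothetical name.

```python
import numpy as np


def solve_wlsq(regression_matrix, regression_response, regression_weights):
    """Weighted least squares without an n x n diagonal matrix (hypothetical helper)."""
    sqrt_weights = np.sqrt(regression_weights)
    # Row scaling by broadcasting equals np.diag(sqrt_weights) @ regression_matrix.
    lhs = regression_matrix * sqrt_weights[:, np.newaxis]
    rhs = regression_response * sqrt_weights
    # lstsq also handles the singular case, so no try-except is needed here.
    solution, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return solution
```

Whether going straight to `lstsq` is faster overall than attempting `np.linalg.solve` on the normal equations first would need to be measured on the actual workload.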

Labels: approximator 🔧 (all issues that are linked to approximators), feature 💡 (new feature or enhancement request), help wanted 🙏 (extra attention is needed)

Status: ✅ Done
