Q&A 39.1 — FeatureAttributor Configuration & Edge Cases #794
web3guru888 asked this question in Q&A (unanswered, 0 replies)
FeatureAttributor — Configuration Questions
Q: How many background samples should KernelSHAP use?
A: Lundberg & Lee recommend ≥100 samples for stable estimates. For high-dimensional data (>50 features), use 500–2000 samples. The convergence rate is O(1/√K) where K = num_samples. Monitor the standard error of Shapley values and increase samples until SE < 0.01 × max|φᵢ|.
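The convergence advice above can be illustrated with a minimal Monte Carlo Shapley estimator that reports the per-feature standard error for the SE-based stopping rule. This is a plain-NumPy sketch of the underlying idea; `monte_carlo_shapley` and its signature are illustrative, not the FeatureAttributor (or shap) API:

```python
import numpy as np

def monte_carlo_shapley(f, x, background, n_samples=500, rng=None):
    """Estimate Shapley values for one instance x by averaging marginal
    contributions over random feature orderings (a simplified stand-in
    for KernelSHAP's weighted regression).

    f: model function mapping (n, d) arrays to (n,) predictions.
    background: (m, d) array used to fill in "absent" features.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[0]
    contribs = np.zeros((n_samples, d))
    for k in range(n_samples):
        perm = rng.permutation(d)
        z = background[rng.integers(len(background))].copy()
        prev = f(z[None, :])[0]
        for j in perm:
            z[j] = x[j]                      # add feature j to the coalition
            cur = f(z[None, :])[0]
            contribs[k, j] = cur - prev      # marginal contribution of j
            prev = cur
    phi = contribs.mean(axis=0)
    # Standard error shrinks as O(1/sqrt(K)); stop when se < 0.01 * max|phi|.
    se = contribs.std(axis=0, ddof=1) / np.sqrt(n_samples)
    return phi, se
```

For a linear model `f(x) = x @ w` the estimate should converge to `w * (x - background.mean(axis=0))`, which makes the sampling error directly observable: double `n_samples` and the reported `se` should drop by roughly `1/√2`.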
Q: Which baseline should Integrated Gradients use?
A: The zero baseline works for images and normalized tabular data. For text, use a padding/mask token embedding. For distributions with meaningful means, use the training set mean. Sturmfels et al. (2020) recommend Expected Gradients (averaging over multiple baselines sampled from the data distribution) for robustness.
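The baseline-averaging idea can be sketched in a few lines: approximate the path integral with a midpoint Riemann sum, then average over baselines drawn from the data distribution. Function names and signatures here are mine for illustration (using an analytic gradient callback rather than autodiff), not the library's API:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=64):
    """Midpoint Riemann-sum approximation of Integrated Gradients.
    grad_f maps a (d,) point to its (d,) gradient."""
    alphas = (np.arange(steps) + 0.5) / steps
    path_grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * path_grads.mean(axis=0)

def expected_gradients(grad_f, x, data, n_baselines=16, steps=64, rng=None):
    """Average IG over baselines sampled from the data distribution,
    in the spirit of Expected Gradients (Sturmfels et al., 2020)."""
    rng = np.random.default_rng(rng)
    idx = rng.integers(len(data), size=n_baselines)
    return np.mean(
        [integrated_gradients(grad_f, x, data[i], steps) for i in idx], axis=0
    )
```

A useful sanity check is the completeness axiom: attributions should sum to `f(x) - f(baseline)`. For a linear model this holds exactly; for a quadratic `f(z) = Σ zⱼ²` the midpoint rule also recovers `xⱼ² - bⱼ²` exactly, since the mean of the midpoint alphas is exactly 0.5.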
Q: How does LIME handle categorical features?
A: LIME perturbs categorical features by sampling from the empirical distribution of each category in the training set. Set `categorical_features` in `AttributionConfig`. For high-cardinality categories, group rare categories first. The kernel width σ should be tuned via cross-validation on a held-out explanation set.
Q: When do different attribution methods disagree?
A: Disagreement is common for correlated features — SHAP distributes importance across correlated features (Shapley fairness), while LIME may concentrate on one. Integrated Gradients can saturate for ReLU networks. Use the `compare_methods` API to identify disagreements and the rank-correlation aggregator to find consensus.
Q: How to handle very high-dimensional inputs (images, text)?
A: For images: use superpixel segmentation (SLIC) to reduce dimensionality before SHAP/LIME. For text: attribute at the token level with Integrated Gradients, or use SHAP with a token-masking perturbation strategy. Set `feature_groups` in the config for grouped attribution.
Q: What is the computational cost?
Related: #788
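The rank-correlation aggregation mentioned in the method-disagreement answer is easy to prototype outside the library: rank features by attribution magnitude per method, average the ranks for a consensus ordering, and use pairwise Spearman correlation to flag disagreement. A minimal sketch (function names are mine, not the `compare_methods` API; ties are not handled):

```python
import numpy as np

def _ranks(v):
    # Rank features by |attribution|, descending: rank 0 = most important.
    # No tie handling — illustrative only.
    return np.argsort(np.argsort(-np.abs(v)))

def spearman(a, b):
    """Spearman rank correlation between two attribution vectors."""
    ra = _ranks(a).astype(float) - (len(a) - 1) / 2.0
    rb = _ranks(b).astype(float) - (len(b) - 1) / 2.0
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

def consensus_ranking(attributions):
    """attributions: dict of method name -> (d,) attribution vector.
    Returns (average rank per feature, pairwise Spearman correlations)."""
    names = list(attributions)
    ranks = np.stack([_ranks(attributions[n]) for n in names]).astype(float)
    avg_rank = ranks.mean(axis=0)
    pairwise = {(a, b): spearman(attributions[a], attributions[b])
                for i, a in enumerate(names) for b in names[i + 1:]}
    return avg_rank, pairwise
```

A pairwise correlation near +1 means the methods agree on the ordering; near −1 means they rank features in opposite order, which is exactly the correlated-features failure mode described above.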