Q&A 39.1 — FeatureAttributor Configuration & Edge Cases #794
web3guru888 asked this question in Q&A (unanswered, 0 replies)
FeatureAttributor — Configuration Questions
Q: How many background samples should KernelSHAP use?
A: Lundberg & Lee recommend ≥100 samples for stable estimates. For high-dimensional data (>50 features), use 500–2000 samples. The convergence rate is O(1/√K) where K = num_samples. Monitor the standard error of Shapley values and increase samples until SE < 0.01 × max|φᵢ|.
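The convergence advice above can be illustrated with a minimal Monte Carlo Shapley estimator that reports the per-feature standard error for the SE-based stopping rule. This is a plain-NumPy sketch of the underlying idea; `monte_carlo_shapley` and its signature are illustrative, not the FeatureAttributor (or shap) API:

```python
import numpy as np

def monte_carlo_shapley(f, x, background, n_samples=500, rng=None):
    """Estimate Shapley values for one instance x by averaging marginal
    contributions over random feature orderings (a simplified stand-in
    for KernelSHAP's weighted regression).

    f: model function mapping (n, d) arrays to (n,) predictions.
    background: (m, d) array used to fill in "absent" features.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[0]
    contribs = np.zeros((n_samples, d))
    for k in range(n_samples):
        perm = rng.permutation(d)
        z = background[rng.integers(len(background))].copy()
        prev = f(z[None, :])[0]
        for j in perm:
            z[j] = x[j]                      # add feature j to the coalition
            cur = f(z[None, :])[0]
            contribs[k, j] = cur - prev      # marginal contribution of j
            prev = cur
    phi = contribs.mean(axis=0)
    # Standard error shrinks as O(1/sqrt(K)); stop when se < 0.01 * max|phi|.
    se = contribs.std(axis=0, ddof=1) / np.sqrt(n_samples)
    return phi, se
```

For a linear model `f(x) = x @ w` the estimate should converge to `w * (x - background.mean(axis=0))`, which makes the sampling error directly observable: double `n_samples` and the reported `se` should drop by roughly `1/√2`.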
Q: Which baseline should Integrated Gradients use?
A: The zero baseline works for images and normalized tabular data. For text, use a padding/mask token embedding. For distributions with meaningful means, use the training set mean. Sturmfels et al. (2020) recommend Expected Gradients (averaging over multiple baselines sampled from the data distribution) for robustness.
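The baseline-averaging idea can be sketched in a few lines: approximate the path integral with a midpoint Riemann sum, then average over baselines drawn from the data distribution. Function names and signatures here are mine for illustration (using an analytic gradient callback rather than autodiff), not the library's API:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=64):
    """Midpoint Riemann-sum approximation of Integrated Gradients.
    grad_f maps a (d,) point to its (d,) gradient."""
    alphas = (np.arange(steps) + 0.5) / steps
    path_grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * path_grads.mean(axis=0)

def expected_gradients(grad_f, x, data, n_baselines=16, steps=64, rng=None):
    """Average IG over baselines sampled from the data distribution,
    in the spirit of Expected Gradients (Sturmfels et al., 2020)."""
    rng = np.random.default_rng(rng)
    idx = rng.integers(len(data), size=n_baselines)
    return np.mean(
        [integrated_gradients(grad_f, x, data[i], steps) for i in idx], axis=0
    )
```

A useful sanity check is the completeness axiom: attributions should sum to `f(x) - f(baseline)`. For a linear model this holds exactly; for a quadratic `f(z) = Σ zⱼ²` the midpoint rule also recovers `xⱼ² - bⱼ²` exactly, since the mean of the midpoint alphas is exactly 0.5.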
Q: How does LIME handle categorical features?
A: LIME perturbs categorical features by sampling from the empirical distribution of each category in the training set. Set `categorical_features` in `AttributionConfig`. For high-cardinality categories, group rare categories first. The kernel width σ should be tuned via cross-validation on a held-out explanation set.
Q: When do different attribution methods disagree?
A: Disagreement is common for correlated features — SHAP distributes importance across correlated features (Shapley fairness), while LIME may concentrate on one. Integrated Gradients can saturate for ReLU networks. Use the `compare_methods` API to identify disagreements and the rank-correlation aggregator to find consensus.
Q: How to handle very high-dimensional inputs (images, text)?
A: For images: use superpixel segmentation (SLIC) to reduce dimensionality before SHAP/LIME. For text: attribute at the token level with Integrated Gradients, or use SHAP with a token-masking perturbation strategy. Set `feature_groups` in the config for grouped attribution.
Q: What is the computational cost?
Related: #788
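The rank-correlation aggregation mentioned in the method-disagreement answer is easy to prototype outside the library: rank features by attribution magnitude per method, average the ranks for a consensus ordering, and use pairwise Spearman correlation to flag disagreement. A minimal sketch (function names are mine, not the `compare_methods` API; ties are not handled):

```python
import numpy as np

def _ranks(v):
    # Rank features by |attribution|, descending: rank 0 = most important.
    # No tie handling — illustrative only.
    return np.argsort(np.argsort(-np.abs(v)))

def spearman(a, b):
    """Spearman rank correlation between two attribution vectors."""
    ra = _ranks(a).astype(float) - (len(a) - 1) / 2.0
    rb = _ranks(b).astype(float) - (len(b) - 1) / 2.0
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

def consensus_ranking(attributions):
    """attributions: dict of method name -> (d,) attribution vector.
    Returns (average rank per feature, pairwise Spearman correlations)."""
    names = list(attributions)
    ranks = np.stack([_ranks(attributions[n]) for n in names]).astype(float)
    avg_rank = ranks.mean(axis=0)
    pairwise = {(a, b): spearman(attributions[a], attributions[b])
                for i, a in enumerate(names) for b in names[i + 1:]}
    return avg_rank, pairwise
```

A pairwise correlation near +1 means the methods agree on the ordering; near −1 means they rank features in opposite order, which is exactly the correlated-features failure mode described above.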