
Move internal shap algorithms into separate namespace. #11985

Merged
RAMitchell merged 21 commits into dmlc:master from RAMitchell:shap-interface on Feb 25, 2026

Conversation

@RAMitchell
Member

No description provided.

```cpp
  {"device", ctx->IsSycl() ? "cpu" : ctx->DeviceName()}};
}

gbm::GBTreeModel LoadGBTreeModel(Learner* learner, Context const* ctx,
```
Member Author


It is quite annoying to get a GBTreeModel out of a booster.

Member


Yeah, I would like to split up the concept of "model" and everything else like "optimizers/tree builders", but it might be too much for XGBoost in its current state.

@RAMitchell
Member Author

I am having two major difficulties with this PR. The first is that the categorical recoding logic is complicated and I don't want to carry it through to the interpretability module. The other is simply getting the tree ensemble out of the learner, which seems somehow impossible in the current setup.

@trivialfis
Member

I can look into simplifying the categorical features after the holiday.

Contributor

Copilot AI left a comment


Pull request overview

This PR refactors SHAP-related implementation details out of the CPU/GPU predictors into a new internal xgboost::interpretability namespace, aiming to reduce duplication and better isolate interpretability code paths.

Changes:

  • Introduces new internal SHAP dispatch API (interpretability::ShapValues, ShapInteractionValues, ApproxFeatureImportance) with CPU and CUDA implementations.
  • Refactors CPU/GPU predictors to delegate SHAP/interaction contribution computation to the new interpretability layer.
  • Extracts shared CPU/GPU data access utilities into dedicated headers and updates SHAP tests to use the new entry points (with reduced runtime).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| tests/cpp/predictor/test_shap.cc | Updates SHAP tests to call the new interpretability SHAP entry points and reduces dataset sizes/iters. |
| src/predictor/interpretability/shap.h | Adds the internal SHAP dispatch header (CPU/CUDA routing). |
| src/predictor/interpretability/shap.cc | Adds CPU implementations for SHAP values, interactions, and approximate importance. |
| src/predictor/interpretability/shap.cu | Adds CUDA implementations for SHAP values/interactions using GPUTreeShap. |
| src/predictor/gpu_predictor.cu | Removes inlined GPU SHAP logic and delegates to the interpretability SHAP functions; reuses the extracted GPU accessors. |
| src/predictor/gpu_data_accessor.cuh | New shared GPU sparse/ellpack access helpers (used by the GPU predictor and SHAP). |
| src/predictor/data_accessor.h | New shared CPU batch-to-FVec access helpers (used by the CPU predictor and CPU SHAP). |
| src/predictor/cpu_predictor.cc | Removes inlined CPU SHAP logic and delegates to the interpretability SHAP functions; uses the extracted accessors. |
| src/gbm/gbtree.h | Formatting-only adjustments. |
| src/gbm/gbtree.cc | Formatting-only adjustments. |
| include/xgboost/predictor.h | Macro formatting-only adjustment. |


Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.




```cpp
HostDeviceVector<float> shap_values;
learner->Predict(p_dmat, false, &shap_values, 0, 0, false, false, true, false, false);
interpretability::ShapValues(dmat->Ctx(), p_dmat.get(), &shap_values, *gbtree, 0, {}, 0, 0);
```

Copilot AI Feb 17, 2026


The tree_weights parameter is a pointer type (std::vector<float> const*), but the test is passing {} (an empty brace-initializer). This will not compile correctly. Pass nullptr instead to indicate no tree weights.

Comment on lines +212 to +213
```cpp
interpretability::ShapInteractionValues(dmat->Ctx(), p_dmat.get(), &shap_interactions, *gbtree, 0,
                                        {}, false);
```

Copilot AI Feb 17, 2026


The tree_weights parameter is a pointer type (std::vector<float> const*), but the test is passing {} (an empty brace-initializer). This will not compile correctly. Pass nullptr instead to indicate no tree weights.

@RAMitchell RAMitchell changed the title from "[WIP] Move internal shap algorithms into separate namespace." to "Move internal shap algorithms into separate namespace." on Feb 17, 2026
@RAMitchell RAMitchell marked this pull request as ready for review February 17, 2026 13:44
@trivialfis
Member

trivialfis commented Feb 20, 2026

I haven't looked into the details yet, but it would be great if we could establish some conventions here:

  • Do we need the gpu_ prefix for CUDA files? My preference is that the .cu(h) file extension is sufficient and doesn't need additional annotations.
  • Since you are trying to separate the SHAP module, should it be outside of the predictor directory?

@RAMitchell
Member Author

> Since you are trying to separate the SHAP module, should it be outside of the predictor directory?

That was my first attempt but it depends on all of the data accessors in prediction, in particular recoding.

@trivialfis
Member

trivialfis commented Feb 22, 2026

> The categorical recoding logic is complicated and I don't want to carry this through to the interpretability module.

Would you like to share a bit more on the complication from a caller's perspective for the SHAP module?

> That was my first attempt but it depends on all of the data accessors in prediction, in particular recoding.

I think you have extracted them out?

@trivialfis
Member

Please let me know if you would like me to give the refactoring a try.

@RAMitchell
Member Author

I originally wanted to have a separate interpretability folder, but these new functions remain coupled to prediction: the SHAP values all need to add up to the margin prediction. You could put the data-loading accessors elsewhere (e.g. data/utils), but I think they end up used only by SHAP and prediction. Let me know if you have better ideas.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.



@RAMitchell RAMitchell merged commit 90fa894 into dmlc:master Feb 25, 2026
82 checks passed


3 participants