[Roadmap] Multiple outputs. #9043
Description
Since XGBoost 1.6, we have been working on multi-output support for the tree model. In 2.0, we will have the initial implementation of the vector-leaf-based multi-output model. This issue serves as a tracker for future development and related discussion. The original feature request is here: #2087. The features listed here are specific to the vector-leaf model rather than general multi-output support.
Feel free to share your suggestions or make related feature requests in the comments.
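To make the distinction concrete, here is a minimal sketch of the two strategies: one scalar-leaf tree per target versus a single vector-leaf tree whose leaves store one value per target. The `Stump` class is hypothetical, for illustration only, and is not XGBoost code.

```python
class Stump:
    """A depth-1 tree; leaf values may be scalars or per-target vectors."""
    def __init__(self, threshold, left, right):
        self.threshold, self.left, self.right = threshold, left, right

    def predict(self, x):
        # Route the sample to one of the two leaves and return its value.
        return self.left if x < self.threshold else self.right

# Strategy 1: one scalar-leaf tree per target (two targets, two trees).
per_target = [Stump(0.5, 1.0, 2.0), Stump(0.5, -1.0, -2.0)]

# Strategy 2: a single vector-leaf tree predicting both targets at once;
# the split structure is shared across targets.
vector_leaf = Stump(0.5, [1.0, -1.0], [2.0, -2.0])

x = 0.3
print([t.predict(x) for t in per_target])  # [1.0, -1.0]
print(vector_leaf.predict(x))              # [1.0, -1.0]
```

The vector-leaf model trades per-target flexibility for a shared tree structure, which is what makes the optimizations tracked below worthwhile.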
Implementation Optimization
- Use f-order for the gradient. Currently, the gradient has one column for each target but is written in C-order. The transformation takes about one-fifth of the training time. (Use matrix for gradient. #9508)
- Use f-order for the custom objective. (Inefficient Casting Grad and Hess to c_float for Custom Obj in Python API #9089)
- Improve array type dispatching by moving the dispatch logic from per-element to per-array. This enables us to have a more efficient custom objective interface. (Optimize array interface input. #9090)
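A small sketch of why the gradient layout matters, assuming an (n_rows, n_targets) gradient matrix stored in a flat buffer; all names here are illustrative, not XGBoost internals:

```python
# grad[row][target]: gradients for 3 rows and 2 targets.
n_rows, n_targets = 3, 2
grad = [[10, 20], [30, 40], [50, 60]]

# C-order (row-major): the targets of one row are adjacent in memory.
c_order = [grad[r][t] for r in range(n_rows) for t in range(n_targets)]

# F-order (column-major): all rows of one target are adjacent, so a
# per-target pass (e.g. histogram building) reads one contiguous slice.
f_order = [grad[r][t] for t in range(n_targets) for r in range(n_rows)]

print(c_order)  # [10, 20, 30, 40, 50, 60]
print(f_order)  # [10, 30, 50, 20, 40, 60]

# In F-order, target t occupies the slice [t * n_rows : (t + 1) * n_rows].
print(f_order[0 * n_rows : 1 * n_rows])  # [10, 30, 50]
```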
Algorithmic Optimization
We are still looking for potential algorithmic optimizations for vector leaf; here is the pool of candidates. We need to survey all available options, so feel free to share ideas or paper recommendations.
- SketchBoost (https://arxiv.org/abs/2201.06239) ([mt] Split up gradient types for the GPU hist. #11798, [mt] Implement reduced gradient for the CPU hist. #11922)
- Extra tree. (#11798)
GPU Implementation
- Evaluation ([mt] Implement vector leaf for a decision stump on GPU. #11781, [mt] Support feature selection. #11883)
- Histogram ([mt] Implement vector leaf for a decision stump on GPU. #11781, [mt] Support building histogram with shared memory. #11855)
- Prediction (Replace the device model. #11752)
- Prediction cache. ([mt] Implement prediction cache. #11862)
- Model ([MT] Add device storage to multi-target tree. #11277)
- Partition. ([mt] Implement partitioning for GPU. #11789)
- Gradient sampling.
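For context on the histogram item above, here is a simplified scalar sketch of gradient histogram building; the real GPU kernels accumulate gradient/Hessian pairs for every target of the vector leaf, on device, and this toy function is illustrative only:

```python
def build_histogram(bin_idx, grad, n_bins):
    """Accumulate gradients into per-feature bins for one target.

    bin_idx[i] is the quantized feature bin of row i; grad[i] is its
    gradient.  Split gain is later evaluated from these bin sums.
    """
    hist = [0.0] * n_bins
    for b, g in zip(bin_idx, grad):
        hist[b] += g
    return hist

print(build_histogram([0, 2, 2, 1], [1.0, 0.5, 0.5, 2.0], n_bins=3))
# [1.0, 2.0, 1.0]
```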
Documentation
- Derive the approximated Hessian in the context of boosting trees.
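For reference, the item above asks for the standard second-order derivation; a sketch, assuming the usual diagonal Hessian approximation for vector leaves:

```latex
% Second-order Taylor expansion of the objective at iteration t:
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}
  \left[ g_i^{\top} f_t(x_i)
       + \tfrac{1}{2}\, f_t(x_i)^{\top} H_i\, f_t(x_i) \right]
  + \Omega(f_t)
% With H_i \approx \mathrm{diag}(h_i) (the approximated Hessian) and
% \Omega(f_t) = \tfrac{\lambda}{2} \sum_j \lVert w_j \rVert^2, the optimal
% weight of leaf j with instance set I_j is, elementwise over targets:
w_j^{*} = -\,\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```

For vector leaves, $g_i$, $h_i$, and $w_j$ are per-target vectors and the division is elementwise; the diagonal approximation is what makes each target's leaf value independent given the shared tree structure.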
Multi-task
- Multi-task XGBoost. This is not yet decided. It is wise to do at least some exploration before finalizing the rest of the implementation, since the interface will look very different if multi-task learning needs to be considered. Related: [RFC] Exposing objectives and metrics as part of the API. #7693
Features
- Tree SHAP
- Plotting (Support graphviz plot for multi-target tree. #10093)
- Model text dump (JSON, txt, graphviz) (Support graphviz plot for multi-target tree. #10093, [mt] Implement model dump for all formats. #11747)
- Tree data frame.
- Categorical feature.
- Interaction constraints
- Subsample.
- Column sampling.
- Approx tree method
- Exact tree method
- Loss weight
- Feature importance (be careful with tree index) ([multi] Implement weight feature importance. #10700)
- Intercept. ([mt] Implement vector intercept. #11656)
Learning to rank
A ranking model could take multiple criteria into account. This might require multi-task support.
Quantile regression
Distributed
- Dask
- PySpark
- Spark
- Flink?
- Federated (Support column split in multi-target hist. #9171)
Binding
- R ([R] Support multi-class custom objective. #9526)
- Scala
- Python
- Java
- C
HPO
- Check compatibility with major HPO frameworks.
Other extensions
- Sparse label. (multi-label classification optimization)
- Missing label.
- Early stopping for each target?
Applications
Benchmarks
- Collection of datasets for future comparison.