
boost_from_average does nothing for custom objectives, and BoostFromScore cannot be set through model params for custom objective #7193

@thebigspin

Description


When using a custom objective function in Python, boost_from_average=True silently has no effect: the initial score is always 0.0, and no warning is emitted.

A comment in gbdt.cpp anticipates this functionality, but it is not implemented in the Python API, since there is currently no way to supply a custom "average":

/* If the custom "average" is implemented it will be used in place of the label average (if enabled)
*
* An improvement to this is to have options to explicitly choose
* (i) standard average
* (ii) custom average if available
* (iii) any user defined scalar bias (e.g. using a new option "init_score" that overrides (i) and (ii) )
*
* (i) and (ii) could be selected as say "auto_init_score" = 0 or 1 etc..
*
*/

Related to #2558 and #3571.

Code path

The full call chain makes the issue clear:

  1. lgb.train() with boost_from_average=True triggers GBDT::BoostFromAverage in gbdt.cpp:
if (config_->boost_from_average || (train_data_ != nullptr && train_data_->num_features() == 0)) {
    double init_score = ObtainAutomaticInitialScore(objective_function_, class_id);
    if (std::fabs(init_score) > kEpsilon) {
        train_score_updater_->AddScore(init_score, class_id);
        ...
    }
}
  2. ObtainAutomaticInitialScore in gbdt.cpp calls BoostFromScore on whatever objective is active:
double ObtainAutomaticInitialScore(const ObjectiveFunction* fobj, int class_id) {
  double init_score = 0.0;
  if (fobj != nullptr) {
    init_score = fobj->BoostFromScore(class_id);
  }
  if (Network::num_machines() > 1) {
    init_score = Network::GlobalSyncUpByMean(init_score);
  }
  return init_score;
}
  3. BoostFromScore in objective_function.h has this base class default:
virtual double BoostFromScore(int /*class_id*/) const { return 0.0; }

Built-in objectives (e.g. regression_l2) override this to return the label mean. Custom Python objectives cannot override it.
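As a Python sketch (not the C++ API), the value regression_l2's BoostFromScore override returns is the weighted label mean; a custom objective always gets the base-class 0.0 instead:

```python
import numpy as np

def l2_boost_from_score(labels, weights=None):
    # What regression_l2's BoostFromScore computes: the (weighted) label mean.
    # A custom Python objective cannot override BoostFromScore, so it falls
    # back to the base-class default of 0.0.
    labels = np.asarray(labels, dtype=float)
    if weights is None:
        return float(labels.mean())
    return float(np.average(labels, weights=np.asarray(weights, dtype=float)))

custom_objective_default = 0.0  # base-class return value in objective_function.h

print(l2_boost_from_score([1.0, 2.0, 3.0, 6.0]))  # 3.0
```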

Reproducible example

The following shows that setting init_score on the Dataset reproduces the boost_from_average result; the docs could be updated to make this equivalence explicit:

import pandas as pd 
import numpy as np
import requests
import io

from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
import lightgbm as lgb

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx"
response = requests.get(url)
df = pd.read_excel(io.BytesIO(response.content), engine="openpyxl")

X = df[[col for col in df.columns if col[0]=='X']]
y = df['Y1']

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, shuffle=True)

dtrain = lgb.Dataset(X_train, label=y_train, categorical_feature=['X6', 'X8'])

def mse(y_pred: np.ndarray, dataset: lgb.Dataset):
    # Custom MSE objective: per-row gradient and hessian of 0.5 * residual^2
    y_true = dataset.get_label()
    residual = y_true - y_pred
    grad = -residual
    hess = np.ones_like(grad)
    return grad, hess

# 1. Built-in regression with boost_from_average=True (default)
bfa_model = lgb.train(
    {'objective': 'regression', 'boost_from_average': True,
     'seed': 42, 'verbose': -1},
    dtrain,
)

# 2. Built-in regression with boost_from_average=False
no_bfa_model = lgb.train(
    {'objective': 'regression', 'boost_from_average': False,
     'seed': 42, 'verbose': -1},
    dtrain,
)

# 3. Custom MSE, no init score — starts from 0
custom_mse_model = lgb.train(
    {'objective': mse, 'seed': 42, 'verbose': -1},
    dtrain,
)

# 4. Custom MSE, manually replicate boost_from_average via set_init_score
init_score = np.average(y_train)  # label mean, matching regression_l2's BoostFromScore
custom_mse_init_model = lgb.train(
    {'objective': mse, 'seed': 42, 'verbose': -1},
    dtrain.set_init_score(np.full(len(y_train), init_score)),
)

bfa_preds                = bfa_model.predict(X_val)
no_bfa_preds             = no_bfa_model.predict(X_val)
custom_mse_preds         = custom_mse_model.predict(X_val)
# must manually add init_score back — predict() is unaware of dataset init score
custom_mse_init_preds    = custom_mse_init_model.predict(X_val) + init_score

print(f"BFA (built-in, boost_from_average=True):     {r2_score(y_val, bfa_preds):.8f}")
print(f"Custom MSE (manual init score, same as BFA): {r2_score(y_val, custom_mse_init_preds):.8f}")
print(f"No BFA (built-in, boost_from_average=False): {r2_score(y_val, no_bfa_preds):.8f}")
print(f"Custom MSE (no init score, same as no BFA):  {r2_score(y_val, custom_mse_preds):.8f}")

Environment info

Python 3.14
LightGBM 4.6.0

Additional Comments

Suggested fix 1:
Add objective_initial_score to params, and allow it to be:

  1. 'mean' (weighted mean),
  2. 'median' (weighted median with alpha=0.5),
  3. 'sigmoid' (for binary/multi-label classification),
  4. 'log-odds' (for multinomial classification),
  5. float between 0-1 for a weighted percentile, or
  6. a function for a custom average.

Options 1-5 are already implemented in the corresponding objective.hpp files.
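A rough Python sketch of how such an option could be resolved. The name objective_initial_score and this resolver are hypothetical, not LightGBM API; the formulas are approximations of what the objective.hpp files compute (case (4), multinomial log-odds, is omitted because it is per-class):

```python
import numpy as np

def resolve_initial_score(option, labels, weights=None):
    """Hypothetical resolver for a proposed `objective_initial_score` param.
    Name and signature are illustrative only."""
    y = np.asarray(labels, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    if callable(option):                        # (6) user-supplied custom average
        return float(option(y, w))
    if option == 'mean':                        # (1) weighted mean (regression_l2)
        return float(np.average(y, weights=w))
    if option == 'median':                      # (2) weighted median == percentile 0.5
        option = 0.5
    if option == 'sigmoid':                     # (3) log-odds of the weighted positive
        p = np.average(y, weights=w)            #     rate (ignoring sigmoid scaling)
        return float(np.log(p / (1.0 - p)))
    if isinstance(option, float):               # (5) weighted percentile in (0, 1)
        order = np.argsort(y)
        cdf = np.cumsum(w[order]) / w.sum()
        return float(y[order][np.searchsorted(cdf, option)])
    raise ValueError(f"unsupported objective_initial_score: {option!r}")
```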

Suggested fix 2:
Document under boost_from_average that setting init_score on the Dataset has the same effect, which is currently the only workaround available to custom objectives.
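The equivalence holds because boost_from_average adds one constant to every raw training score (the AddScore call in gbdt.cpp), which is exactly what a constant per-row init_score supplies. A minimal numpy illustration:

```python
import numpy as np

# boost_from_average shifts every raw score by a single constant via
# train_score_updater_->AddScore(init_score, class_id). Passing a constant
# per-row array to Dataset.set_init_score produces the same starting scores,
# which is why the two approaches train identical trees.
n = 5
raw_scores = np.zeros(n)                              # boosting starts from score 0
init_score = 3.0                                      # e.g. the label mean
via_add_score = raw_scores + init_score               # what AddScore does internally
via_init_score = raw_scores + np.full(n, init_score)  # the Dataset init_score path
print(np.array_equal(via_add_score, via_init_score))  # True
```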
