Periodic Features #4281

@Fish-Soup

Summary

Create periodic features such that there is no discontinuity in feature space where there shouldn't be one. E.g. day 365 of the year should be adjacent to day 1. I imagine the API would work similarly to the specification of categorical features, with the additional component of the minimum and maximum feature values that are equivalent (e.g. hours 0 and 24 of the day are the same).

Motivation

I primarily work with time series forecasting, and common features we use are hour of day, day of week and day of year. Take day of year as an example: if I use the feature as-is, day 365 is not adjacent to day 1 in feature space, even though it is in reality. The model has to learn that these days are likely to be similar rather than starting from that prior. I am often in the position where I only have data for, say, January to May. My prediction for December would then have the day-of-year feature built off May's data, when in fact it should be more like January. A periodic feature would also provide an additional constraint that should help the model fit better in the case of hour of day or day of week. Angle is another example of a periodic feature.

Description

From a user perspective, I imagine that we specify which features are periodic and which minimum and maximum feature values are equivalent, e.g. 0 and 24 for hours. This could be done by passing a Dict[feature_name, Tuple[minval, maxval]], in the same way as the categorical features are defined.
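To make the proposed specification concrete, here is a minimal sketch of what such a dictionary might look like and how the equivalence of the two bounds could be interpreted. The names `periodic_features`, `wrap` and `circular_distance` are hypothetical and not part of any existing API, and treating day 366 as equivalent to day 1 (ignoring leap years) is an assumption made for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical specification: feature name -> (minval, maxval), where
# minval and maxval denote the same point on the cycle.
periodic_features: Dict[str, Tuple[float, float]] = {
    "hour_of_day": (0.0, 24.0),   # hour 24 is the same as hour 0
    "day_of_week": (0.0, 7.0),
    "day_of_year": (1.0, 366.0),  # assumption: non-leap year, day 366 == day 1
}


def wrap(value: float, bounds: Tuple[float, float]) -> float:
    """Map a value onto the half-open interval [minval, maxval)."""
    lo, hi = bounds
    return lo + (value - lo) % (hi - lo)


def circular_distance(a: float, b: float, bounds: Tuple[float, float]) -> float:
    """Shortest distance between a and b when travelling around the cycle."""
    lo, hi = bounds
    period = hi - lo
    d = abs(wrap(a, bounds) - wrap(b, bounds))
    return min(d, period - d)
```

Under this interpretation, `circular_distance(365.0, 1.0, periodic_features["day_of_year"])` is one day rather than 364, which is the adjacency the request is asking the model to start from.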

Internally in the tree algorithm, the first split on a periodic feature would need two leaf boundaries rather than one, so the best pair of boundaries would be chosen. After that first split, I imagine the algorithm working as it currently does.
In the hour of day example, the optimum first split might be defined by hours 3 and 12, in which case one leaf is 3 < h < 12 and the other is 12 < h < 24 together with 0 < h < 3.
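The two-boundary first split can be illustrated with a naive exhaustive search. This is only a sketch under stated assumptions: it uses NumPy, scores candidate pairs by the summed squared error of a constant prediction per leaf, and uses half-open intervals (`a <= x < b`) so that every value lands in exactly one leaf. The function name `best_periodic_split` is hypothetical, and a real tree learner would use its own gain formula and histogram-based search rather than this quadratic loop.

```python
import numpy as np


def best_periodic_split(x, y, candidates):
    """Return (a, b, sse) for the best pair of boundaries with a < b.

    One leaf holds a <= x < b; the other holds the wrap-around remainder
    (x >= b or x < a). Returns None if no valid pair exists.
    """
    best = None
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            inside = (x >= a) & (x < b)
            # Skip pairs that leave one of the two leaves empty.
            if inside.all() or not inside.any():
                continue
            y_in, y_out = y[inside], y[~inside]
            sse = ((y_in - y_in.mean()) ** 2).sum() + ((y_out - y_out.mean()) ** 2).sum()
            if best is None or sse < best[2]:
                best = (a, b, sse)
    return best


# Toy data: the target is 1 between hours 3 and 12 and 0 elsewhere,
# so the "elsewhere" leaf wraps around midnight.
hours = np.arange(24)
target = np.where((hours >= 3) & (hours < 12), 1.0, 0.0)
a, b, sse = best_periodic_split(hours, target, list(range(25)))
```

On this toy data the search recovers the boundaries 3 and 12 from the example above. A single-threshold split cannot separate these two groups, because the second leaf spans both ends of the range.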

If linear_tree=True then only one initial split would be required to fit a linear relationship.

References

Periodic constraints have been implemented in the pygam package.
https://pygam.readthedocs.io/en/latest/notebooks/tour_of_pygam.html
