In my mind, this means that each tree just takes one feature, and produces a step function based upon it.

In the limit of n_estimators being very large and max_depth=1, does xgboost become a linear model?

On my dataset, a grid search selected max_depth = 1, so I'm wondering if I should be building a linear model instead.

cjm2671

1 Answer

A decision tree is a non-linear mapping from x to y, and XGBoost (like LightGBM) builds an ensemble of decision trees, so your model will still be non-linear with max_depth = 1. Each tree is then a single split on one feature, so the ensemble is a sum of step functions: additive across features, but piecewise constant rather than linear in each one. max_depth = 1 will also most likely keep the model from becoming complex enough to capture complex patterns in the data, such as feature interactions, since you allow only one split per tree. max_depth is normally limited to prevent overfitting by stopping trees from growing very deep.
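To make the distinction concrete, here is a minimal sketch in plain Python. It does not call XGBoost: `STUMPS` is a hand-built, hypothetical stand-in for a max_depth = 1 ensemble, and every threshold and leaf value in it is made up for illustration. It checks two things: the prediction is additive across features (no interactions), and it is still not linear.

```python
# Hypothetical stand-in for a max_depth=1 boosted ensemble (not XGBoost itself).
# Each stump is (feature_index, threshold, left_value, right_value);
# all numbers are invented for illustration.
STUMPS = [
    (0, 0.0, -1.0, 1.0),
    (0, 1.0, -0.5, 0.5),
    (1, 0.5, -2.0, 2.0),
    (0, 2.0, -0.25, 0.25),
    (1, 1.5, -1.0, 1.0),
]


def stump(x, feature, threshold, left, right):
    """One depth-1 tree: a single split on a single feature."""
    return left if x[feature] < threshold else right


def predict(x):
    """Ensemble prediction: the sum of all stump outputs."""
    return sum(stump(x, *s) for s in STUMPS)


# Additivity check: swapping feature values between two points leaves the
# total unchanged, i.e. f(a0,a1) + f(b0,b1) == f(a0,b1) + f(b0,a1).
a, b = (-1.0, 0.0), (3.0, 2.0)
mixed_1, mixed_2 = (a[0], b[1]), (b[0], a[1])
additive_gap = predict(a) + predict(b) - predict(mixed_1) - predict(mixed_2)

# Linearity check: a linear function satisfies f(midpoint) == mean of endpoints.
lo, hi = (-1.0, 0.0), (0.5, 0.0)
mid = ((lo[0] + hi[0]) / 2, 0.0)
linear_gap = predict(mid) - (predict(lo) + predict(hi)) / 2

print("additive gap:", additive_gap)
print("linear gap:", linear_gap)
```

A zero additive gap together with a non-zero linear gap is what you would expect from stumps: the model behaves like an additive model of per-feature step functions, not like a linear regression.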

I suggest looking at the XGBoost documentation, since the tree parameters interact strongly with each other during hyperparameter tuning: https://xgboost.readthedocs.io/en/latest/parameter.html

Ugur MULUK