December 2018
Beginner to intermediate
684 pages
21h 9m
English
Gradient boosting models typically use decision trees to capture feature interaction, and the size of individual trees is the most important tuning parameter. XGBoost and CatBoost set the max_depth default to 6. In contrast, LightGBM uses a default num_leaves value of 31, which corresponds to five levels for a balanced tree, but imposes no constraints on the number of levels. To avoid overfitting, num_leaves should be lower than 2^max_depth. For example, for a well-performing max_depth value of 7, you would set num_leaves to 70–80 rather than 2^7 = 128, or directly constrain max_depth.
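The relationship between depth and leaf count can be checked with a few lines of arithmetic. The following is a minimal sketch, not a recommended recipe: the helper names full_tree_leaves and suggest_num_leaves are hypothetical, and the 0.6 fraction is an assumption chosen only because it lands inside the 70–80 range mentioned above for a depth of 7. The resulting dictionary uses the LightGBM parameter names num_leaves and max_depth, but no library is imported or called here.

```python
def full_tree_leaves(max_depth: int) -> int:
    """Number of leaves in a fully grown, balanced binary tree of this depth."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    return 2 ** max_depth


def suggest_num_leaves(max_depth: int, fraction: float = 0.6) -> int:
    """Pick a leaf count below the full-tree bound.

    `fraction` is an illustrative assumption, not a library default.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    # Keep at least two leaves so the tree can make one split.
    return max(2, int(full_tree_leaves(max_depth) * fraction))


# Tree-size settings for a target depth of 7, using LightGBM parameter names.
target_depth = 7
params = {
    "max_depth": target_depth,
    "num_leaves": suggest_num_leaves(target_depth),
}

print("upper bound:", full_tree_leaves(target_depth))
print("params:", params)
```

Setting both values, as in the dictionary above, caps the depth explicitly while keeping the leaf count below what a full tree of that depth would allow. The fraction is a starting point to tune with cross-validation rather than a fixed rule.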
The number of trees or boosting iterations defines the overall size of the ensemble. All libraries support early_stopping to abort training ...