December 2018
Beginner to intermediate
684 pages
21h 9m
English
The CatBoost and LightGBM implementations handle categorical variables directly without the need for dummy encoding.
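To make the contrast with dummy encoding concrete, here is a minimal sketch in plain Python. The `sector` column and both helper functions are hypothetical illustrations, not part of either library. Dummy encoding expands one column into one indicator column per level, whereas a single column of integer codes keeps the feature intact; this is the kind of input that can be flagged as categorical through LightGBM's `categorical_feature` or CatBoost's `cat_features` parameters.

```python
def one_hot(values):
    """Dummy-encode a categorical column: one 0/1 column per distinct level."""
    levels = sorted(set(values))
    return levels, [[int(v == level) for level in levels] for v in values]


def integer_codes(values):
    """Map each level to one integer code, keeping the feature in a single column."""
    levels = sorted(set(values))
    lookup = {level: code for code, level in enumerate(levels)}
    return levels, [lookup[v] for v in values]


# Hypothetical categorical feature with three levels
sector = ["tech", "energy", "tech", "finance", "energy"]

levels, dummies = one_hot(sector)
_, codes = integer_codes(sector)

print(levels)           # the distinct category levels
print(len(dummies[0]))  # dummy encoding needs one column per level
print(codes)            # the same information in a single column
```

The integer codes are labels rather than magnitudes, so they are only appropriate when the model is told that the column is categorical. A feature with many levels makes the difference more pronounced, since dummy encoding adds one column per level.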
CatBoost, which is named for its treatment of categorical features, offers several options for handling them beyond automatic one-hot encoding. It maps either the categories of an individual feature or combinations of categories across several features to numerical values; in other words, it can create new categorical features from combinations of existing ones. The numerical value assigned to each category level, or to each combination of levels, depends on its relationship with the outcome value. In the classification case, ...