Gradient tree boosting (GTB)

Gradient boosting is another improved version of boosting. Like AdaBoost, it can be framed as a gradient descent optimization of a loss function. The algorithm has proven to be one of the most proficient ensemble methods, though it is characterized by an increased variance of estimates and a greater sensitivity to noise in the data (both problems can be attenuated by using sub-sampling), as well as significant computational costs, since its operations cannot be parallelized.
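As a minimal sketch (not the book's code, and using a synthetic dataset rather than covertype), the sub-sampling remedy mentioned above corresponds to the `subsample` parameter of scikit-learn's `GradientBoostingClassifier`: setting it below 1.0 trains each tree on a random fraction of the rows (stochastic gradient boosting), which helps reduce variance and noise sensitivity:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic data; the book works with the covertype dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=101)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101)

gtb = GradientBoostingClassifier(
    n_estimators=100,   # number of sequentially fitted trees
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    subsample=0.7,      # fit each tree on 70% of the training rows
    random_state=101)
gtb.fit(X_train, y_train)
print("Accuracy: %.3f" % gtb.score(X_test, y_test))
```

Note that because the trees are fitted one after another, each one correcting the residual errors of the ensemble so far, the fitting loop itself cannot be run in parallel, which is the computational cost referred to above.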

To demonstrate how GTB performs, we will again check whether we can improve our predictive performance on the covertype dataset, which was already examined when illustrating linear SVM and ensemble algorithms:

In: import pickle
    covertype_dataset = pickle.load(open("covertype_dataset.pickle", "rb"))
