Choosing an algorithm

For this task, we will use xgboost, a very popular implementation of the gradient tree boosting algorithm. Boosting works well because each model iteration learns from the errors of the previous one. This iterative learning stands in contrast to bagging. Both ensemble techniques can be used to compensate for a known weakness of tree-based learners: their tendency to overfit the training data.
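To make this concrete, here is a minimal sketch using the classic xgboost R interface. The simulated matrix x, labels y, and every parameter value are illustrative assumptions, not taken from the text:

library(xgboost)

set.seed(42)

# Hypothetical data: 500 rows, 4 numeric features, binary label
x <- matrix(rnorm(500 * 4), ncol = 4)
y <- as.numeric(x[, 1] + rnorm(500) > 0)

# Each boosting round fits a new shallow tree to the errors of the
# ensemble built so far
model <- xgboost(
  data      = x,
  label     = y,
  nrounds   = 50,                  # number of boosting iterations
  max_depth = 3,                   # keep individual trees shallow
  eta       = 0.1,                 # learning rate; shrinks each tree's contribution
  objective = "binary:logistic",
  verbose   = 0
)

# Predicted probabilities on the training data
preds <- predict(model, x)

The small eta value is typical of boosting: because every round builds on the last, each new tree is allowed only a modest correction, which slows learning but reduces overfitting.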

One simple difference between bagging and boosting is that with bagging, full trees are grown independently and their results are averaged, whereas with boosting, each iteration of the tree model learns from the model before it, as the sketch below illustrates. This is an important concept, as this idea of an algorithm ...
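For contrast with the boosting example above, here is a minimal bagging-style sketch using the randomForest package (a random forest adds feature subsampling on top of plain bagging). It reuses the hypothetical x and y defined earlier:

library(randomForest)

# Bagging-style ensemble: each of the 500 trees is grown on an
# independent bootstrap sample and their votes are combined; no tree
# learns from the errors of another
bagged <- randomForest(x = x, y = as.factor(y), ntree = 500)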
