Mastering Machine Learning with R - Second Edition by Cory Lesmeister

Extreme gradient boosting - classification

As mentioned previously, we will be using the xgboost package in this section, which we have already loaded. Given the method's well-earned reputation, let's try it on the diabetes data.

As stated in the boosting overview, we will be tuning a number of parameters:

  • nrounds: The maximum number of boosting iterations (the number of trees in the final model).
  • colsample_bytree: The fraction of features to sample when building each tree. Default is 1 (100% of the features).
  • min_child_weight: The minimum sum of instance weights required in a child node; larger values make the algorithm more conservative. Default is 1.
  • eta: Learning rate, which shrinks the contribution of each tree to the solution. Default is 0.3.
  • gamma: Minimum loss reduction required to make a further partition on a leaf node. Default is 0.
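As a rough sketch of how these parameters can be tuned together, the grid below searches over several of them with caret's train() wrapper for xgboost. The data frame name (pima.train) and its column layout are assumptions for illustration, not from the text; the seven grid columns are the ones caret's "xgbTree" method requires.

```r
library(caret)
library(xgboost)

# Candidate values for the parameters described above; subsample and
# max_depth are also required by caret's xgbTree grid.
grid <- expand.grid(
  nrounds = c(75, 100),
  colsample_bytree = 1,
  min_child_weight = 1,
  eta = c(0.01, 0.1, 0.3),   # learning rate
  gamma = c(0.25, 0.5),
  subsample = 1,
  max_depth = c(2, 3)
)

# 5-fold cross-validation to score each parameter combination
cntrl <- trainControl(method = "cv", number = 5, verboseIter = TRUE)

set.seed(123)
# pima.train is a hypothetical training set: features in columns 1:7,
# the class label (a factor) in column 8.
train.xgb <- train(
  x = pima.train[, 1:7],
  y = pima.train[, 8],
  trControl = cntrl,
  tuneGrid = grid,
  method = "xgbTree"
)
train.xgb$bestTune   # the winning parameter combination
```

The best combination found here is then passed to xgboost itself (or reused via caret) to fit the final model.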
