July 2019
Beginner to intermediate
298 pages
7h 20m
English
In this section, we will classify the dataset using bagging. As we have previously shown, decision trees with a maximum depth of five are optimal; thus, we will use these trees for our bagging example.
We would like to optimize the ensemble's size. We will generate validation curves on the original train set by testing ensemble sizes in the range [5, 30]. The resulting curves are depicted in the following graph:

We observe that variance is minimized for an ensemble size of 10; thus, we will use ensembles of size 10.
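As a rough illustration of how such validation curves could be produced, the following sketch uses scikit-learn's `validation_curve` with a `BaggingClassifier` built on depth-five decision trees. The synthetic data from `make_classification`, the step of 5 between ensemble sizes, the 3-fold cross-validation, and the accuracy metric are all assumptions made for this example; they are not taken from the book's own code or dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): stands in for the book's train set.
x_train, y_train = make_classification(
    n_samples=300, n_features=10, random_state=0)

# Base learner: a decision tree with a maximum depth of five, as in the text.
base = DecisionTreeClassifier(max_depth=5, random_state=0)
ensemble = BaggingClassifier(base, random_state=0)

# Ensemble sizes to test within [5, 30]; the step of 5 is an assumption.
sizes = np.arange(5, 31, 5)

# Each row of the score arrays corresponds to one ensemble size,
# each column to one cross-validation fold.
train_scores, test_scores = validation_curve(
    ensemble, x_train, y_train,
    param_name="n_estimators", param_range=sizes,
    cv=3, scoring="accuracy",
)

# Report the mean and spread of the scores for every ensemble size.
for size, tr, te in zip(sizes, train_scores, test_scores):
    print(f"size={size:2d}  train={tr.mean():.3f}  "
          f"cv={te.mean():.3f} (std {te.std():.3f})")
```

The standard deviation across folds gives a simple estimate of variance at each size, so the size with the smallest spread and a competitive mean score would be a reasonable choice.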
The following code loads the data and libraries ...