May 2019
Intermediate to advanced
162 pages
4h 24m
English
In the previous section, we computed the information gain for a given split. Recall that it is calculated from the Gini impurity of the parent node and of each LeafNode the split produces. A higher information gain is better: it means the split has successfully reduced the impurity of the child nodes. However, we still need to know how a candidate split is produced before it can be evaluated.
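As a quick refresher, here is a minimal sketch of that calculation. It assumes the common formulation in which the gain is the parent's Gini impurity minus the size-weighted impurity of the two children. The function names `gini_impurity` and `information_gain` are illustrative and are not taken from the previous section's code.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a sequence of class labels: 1 - sum(p_c ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def information_gain(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children.

    Assumption: children are weighted by their share of the parent's samples.
    """
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


parent = [0, 0, 1, 1]

# A split that separates the classes perfectly keeps all of the gain.
pure_gain = information_gain(parent, [0, 0], [1, 1])

# A split that leaves both children as mixed as the parent gains nothing.
mixed_gain = information_gain(parent, [0, 1], [0, 1])
```

The first split yields two pure children, so its gain equals the parent's full impurity. The second leaves the class mix unchanged, so its gain is zero.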
For each split, beginning with the root, the algorithm scans all the features in the data and selects a random number of values for each. There are various strategies for selecting these values. For the general use case, we will describe and use a k-random approach:
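One possible reading of a k-random strategy is sketched below. It is an illustration under stated assumptions, not the book's implementation: for every feature it samples up to `k` of the distinct observed values as thresholds, splits on `value <= threshold`, and keeps the candidate with the highest Gini-based gain. The names `candidate_splits` and `best_split`, the `k` and `seed` parameters, and the `<=` rule are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gain(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    return gini(parent) - (
        (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    )


def candidate_splits(X, k, rng):
    """Yield (feature_index, threshold) pairs: up to k random values per feature."""
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})  # distinct observed values
        for threshold in rng.sample(values, min(k, len(values))):
            yield j, threshold


def best_split(X, y, k=3, seed=0):
    """Return (feature_index, threshold, gain) of the best sampled candidate."""
    rng = random.Random(seed)
    best = (None, None, 0.0)
    for j, threshold in candidate_splits(X, k, rng):
        left = [label for row, label in zip(X, y) if row[j] <= threshold]
        right = [label for row, label in zip(X, y) if row[j] > threshold]
        if not left or not right:
            continue  # a split that leaves one side empty is not useful
        g = gain(y, left, right)
        if g > best[2]:
            best = (j, threshold, g)
    return best


# Toy data: feature 0 separates the classes, feature 1 is constant.
X = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]
y = [0, 0, 1, 1]
feature, threshold, best_gain = best_split(X, y, k=4)
```

With `k` at least as large as the number of distinct values, every value is tried and the search is exhaustive. A smaller `k` trades some split quality for speed, which is the point of sampling.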