May 2019
Intermediate to advanced
162 pages
4h 24m
English
What we're interested in doing with decision trees is defining a flexible extensible algorithm that can achieve the decision tree. This is where the Classification and Regression Trees (CART) algorithm comes in. CART is generalizable to either task and it learns, essentially, by asking questions of the data. At each split point, CART will scan the entire feature space, sampling values from each feature to identify the best feature and value for the split. It does this by evaluating the information gain formula, which seeks to maximize a gain in purity in the split, which is pretty intuitive. Gini Impurity is computed at the leaf level, and is a way of measuring how pure or impure a leaf is; its formula is ...
Read now
Unlock full access