Implementing a decision tree from scratch

With a solid understanding of the metrics for evaluating a partition, let's practice the CART algorithm by hand on a toy dataset.

To begin, we decide on the first splitting point, the root, by trying out all possible values for each of the two features. We use the weighted_impurity function we just defined to calculate the weighted Gini Impurity of each candidate split; a sketch of that helper is shown below, followed by the resulting values.
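In case it is not in front of you, here is a minimal sketch of what such a weighted_impurity helper could look like, assuming Gini Impurity as the splitting criterion; the exact function defined in the previous section may differ in signature and detail:

import numpy as np

# Minimal sketch of a weighted_impurity helper (assumes Gini Impurity as the criterion)
def gini_impurity(labels):
    # An empty subset is treated as pure by convention
    if len(labels) == 0:
        return 0
    # Count the occurrences of each class and convert to fractions
    counts = np.unique(labels, return_counts=True)[1]
    fractions = counts / float(len(labels))
    # Gini Impurity: 1 minus the sum of squared class fractions
    return 1 - np.sum(fractions ** 2)

def weighted_impurity(groups):
    # Impurity of a split: each child's impurity weighted by its share of the samples
    total = sum(len(group) for group in groups)
    return sum(len(group) / float(total) * gini_impurity(group)
               for group in groups)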

Gini(interest, tech) = weighted_impurity([[1, 1, 0], [0, 0, 0, 1]]) = 0.405
Gini(interest, Fashion) = weighted_impurity([[0, 0], [1, 0, 1, 0, 1]]) = 0.343
Gini(interest, Sports) = weighted_impurity([[0, 1], [1, 0, ...
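As a quick check, the first two values quoted above can be reproduced with the sketch (or with the weighted_impurity function defined in the previous section):

print(round(weighted_impurity([[1, 1, 0], [0, 0, 0, 1]]), 3))    # interest = tech:    0.405
print(round(weighted_impurity([[0, 0], [1, 0, 1, 0, 1]]), 3))    # interest = Fashion: 0.343

Among all candidate splits, the one with the lowest weighted Gini Impurity is ultimately chosen as the root.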
