August 2018
Intermediate to advanced
522 pages
12h 45m
English
When growing a Decision Tree on a multidimensional dataset, it can be useful to evaluate how much each feature contributes to predicting the output values. In Chapter 3, Feature Selection and Feature Engineering, we discussed some methods to reduce the dimensionality of a dataset by selecting only the most significant features. Decision Trees offer a different approach, based on the impurity reduction produced by each feature. In particular, the importance of a feature x(i) can be determined as follows:

The sum is extended to all nodes where x(i) is used, and Nk is the number of samples reaching the node, ...
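As a minimal sketch of the idea, the following snippet recomputes the feature importances of a fitted scikit-learn tree by hand. It assumes the usual weighted impurity decrease: for every node that splits on x(i), the impurity reduction at that node is weighted by Nk / N, where N is the total number of samples, and the per-feature totals are then normalized to sum to 1. The synthetic dataset, the random seeds, and the variable names are illustrative assumptions and are not taken from the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset (an assumption for illustration): 3 informative features, 2 noise features
X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=0, random_state=1000)

clf = DecisionTreeClassifier(random_state=1000).fit(X, y)
tree = clf.tree_

n_total = tree.weighted_n_node_samples[0]  # N: samples reaching the root
importances = np.zeros(X.shape[1])

for k in range(tree.node_count):
    left, right = tree.children_left[k], tree.children_right[k]
    if left == -1:
        # Leaf node: no split, so no impurity reduction to attribute
        continue

    n_k = tree.weighted_n_node_samples[k]  # Nk: samples reaching node k
    n_left = tree.weighted_n_node_samples[left]
    n_right = tree.weighted_n_node_samples[right]

    # Impurity reduction obtained by the split performed at node k
    delta = (tree.impurity[k]
             - (n_left / n_k) * tree.impurity[left]
             - (n_right / n_k) * tree.impurity[right])

    # Accumulate on the feature used at node k, weighted by Nk / N
    importances[tree.feature[k]] += (n_k / n_total) * delta

# Normalize so that the importances sum to 1
importances /= importances.sum()

print(importances)
print(clf.feature_importances_)
```

The manually computed vector should agree with the feature_importances_ attribute exposed by the fitted estimator, which is what would normally be used in practice.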