July 2017
Intermediate to advanced
360 pages
8h 26m
English
When growing a decision tree on a multidimensional dataset, it can be useful to evaluate the importance of each feature in predicting the output values. In Chapter 3, Feature Selection and Feature Engineering, we discussed some methods to reduce the dimensionality of a dataset by selecting only the most significant features. Decision trees offer a different approach, based on the impurity reduction contributed by each feature. In particular, the importance of a feature x_i can be determined as:

$$\text{Importance}(x_i) = \sum_k \frac{N_k}{N} \, \Delta I_{x_i}^{(k)}$$

The sum extends over all nodes k where x_i is used, N_k is the number of samples reaching node k, N is the total number of samples, and \Delta I_{x_i}^{(k)} is the impurity reduction obtained by splitting on x_i at node k. Therefore, ...
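The weighted sum described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the book's own code: the `Node` class, the `feature_importances` function, and the small hand-built tree are all hypothetical. The impurity reduction at a node is assumed to be the node's impurity minus the sample-weighted impurity of its two children, and the total sample count N is taken from the root.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; a leaf has feature=None and no children."""
    n_samples: int                   # N_k: samples reaching this node
    impurity: float                  # e.g. Gini impurity at this node
    feature: Optional[int] = None    # index i of the feature x_i used to split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def feature_importances(root: Node, n_features: int) -> List[float]:
    """Sum (N_k / N) * impurity reduction over the nodes that split on each feature."""
    total = root.n_samples           # N: assumed to be the sample count at the root
    scores = [0.0] * n_features
    stack = [root]
    while stack:
        node = stack.pop()
        if node.feature is None:     # leaves contribute nothing
            continue
        children = (node.left, node.right)
        # Assumed definition of the impurity reduction at node k
        child_impurity = sum(
            c.n_samples / node.n_samples * c.impurity for c in children
        )
        delta = node.impurity - child_impurity
        scores[node.feature] += node.n_samples / total * delta
        stack.extend(children)
    return scores


# Hand-built example: 100 samples, two features, Gini impurities chosen by hand
tree = Node(
    n_samples=100, impurity=0.5, feature=0,
    left=Node(
        n_samples=50, impurity=0.32, feature=1,
        left=Node(n_samples=40, impurity=0.0),
        right=Node(n_samples=10, impurity=0.0),
    ),
    right=Node(n_samples=50, impurity=0.32),
)

scores = feature_importances(tree, n_features=2)
normalized = [s / sum(scores) for s in scores]
print(scores, normalized)
```

In this toy tree the root split on feature 0 affects every sample but removes only part of the impurity, while the split on feature 1 affects half of the samples and removes all of the impurity that remains there, so the two scores end up close to each other. Dividing by the sum, as in `normalized`, is a common convention that makes the importances add up to 1.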