July 2018
Intermediate to advanced
334 pages
8h 20m
English
Each feature in a multidimensional dataset is a contributing factor in arriving at a prediction:
Some features are more important than others in contributing to the final prediction. In other words, a (final) prediction is made on what category a new sample belongs to. For example, in the breast cancer dataset in Chapter 2, Build a Breast Cancer Prognosis Pipeline with Spark and Scala, the Random Forest algorithm can be used to estimate feature importance. In the following list the top-most features have ...
Read now
Unlock full access