May 2019
We'll now use the Information package to calculate the information values (IVs) for our features. Then I'll show you how to evaluate those values and produce some plots. Since there are no hard-and-fast rules about thresholds for feature inclusion, I'll offer my judgment about where to draw the line. Of course, you can reject that and apply your own.
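To make concrete what an IV measures before handing the work to the package, here is a minimal base R sketch that computes the IV of a single categorical feature by hand, using the usual definition: the sum over levels of (share of events minus share of non-events) times the weight of evidence. The `toy` data frame and the `iv_by_hand()` helper are invented for illustration and are not part of the Information package, which also bins numeric features and may differ in its details.

```r
# Toy data, invented purely for illustration (not the train set used in the text)
toy <- data.frame(
  feature = c("a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c"),
  y       = c(1,   1,   1,   0,   1,   0,   0,   0,   1,   1,   0,   0)
)

# Hypothetical helper: IV of one categorical feature against a binary 0/1 response.
# Assumes every level has at least one event and one non-event; an empty cell
# would make the log ratio infinite.
iv_by_hand <- function(x, y) {
  counts <- table(x, y)                              # rows = levels, columns = "0" and "1"
  p_event    <- counts[, "1"] / sum(counts[, "1"])   # share of all y == 1 rows in each level
  p_nonevent <- counts[, "0"] / sum(counts[, "0"])   # share of all y == 0 rows in each level
  woe <- log(p_event / p_nonevent)                   # weight of evidence per level
  sum((p_event - p_nonevent) * woe)                  # information value
}

iv_toy <- iv_by_hand(toy$feature, toy$y)
print(iv_toy)
```

A level whose event and non-event shares are equal, like "c" here, contributes nothing to the total, so a feature with a higher IV is one whose levels separate the two response classes more strongly.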
The call to create_infotables() builds a series of tables you can use to explore the results. To get started, you only need to specify the data and the response, or "y", variable:
IV <- Information::create_infotables(data = train, y = "y", parallel = FALSE)
This will give us an IV summary of the top 25 features:
> knitr::kable(head(IV$Summary, 25))
|    |Variable |     IV|
|:---|:--------|------:|
...