In this recipe, we will focus on removing irrelevant features from our models, extracting each feature's importance, and using recursive elimination to select features from a preliminary model (this selection can later be used in a secondary model containing only the right features).
We will work with the Boston dataset, where the idea is to predict the median property price based on environmental variables, crime indices, and so on. We will use a random forest model and follow a manual approach to feature selection, in which we train a model, obtain each feature's importance, and build a final model:
- First, we load the Boston dataset, define the control and the grid for the model, and then train the ... (see the sketch below).
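
The following is a minimal sketch of this first step and of the overall manual workflow, assuming the MASS, caret, and randomForest packages are available. The cross-validation settings, the mtry values in the tuning grid, and the cut-off of keeping the top five features are illustrative assumptions, not the recipe's exact configuration:

```r
# Illustrative sketch: train a tuned random forest on Boston, rank the
# features by importance, and refit a final model on the top-ranked ones.
# The CV settings, mtry grid, and "keep 5 features" cut-off are assumptions.
library(MASS)     # provides the Boston dataset
library(caret)

set.seed(100)

# Control and tuning grid for the preliminary model
ctrl <- trainControl(method = "cv", number = 5)
grid <- expand.grid(mtry = c(2, 4, 6, 8))

# Train the preliminary random forest
rf_fit <- train(medv ~ ., data = Boston,
                method     = "rf",
                trControl  = ctrl,
                tuneGrid   = grid,
                importance = TRUE)

# Extract and rank the feature importances
imp    <- varImp(rf_fit)$importance
ranked <- rownames(imp)[order(imp$Overall, decreasing = TRUE)]

# Build the final model using only the top-ranked features
top_features  <- ranked[1:5]
final_formula <- reformulate(top_features, response = "medv")
final_fit <- train(final_formula, data = Boston,
                   method = "rf", trControl = ctrl,
                   tuneGrid = expand.grid(mtry = 2:5))
print(final_fit)
```

The five-feature cut-off here is only a placeholder; in practice the cut-off would be chosen by inspecting the importance ranking, or by the recursive elimination approach mentioned above.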