In the following exercise, we will work with the well-known Boston dataset, which ships with R as part of the MASS package. The idea is to model the median property value in Boston (the medv column) in terms of several numeric indicators, such as environmental and crime factors:
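- First, we have to load the required packages and fit the full model to the data:
library(MASS)     # provides the Boston data frame
library(olsrr)
library(dplyr)    # loaded after MASS so that dplyr::select() is not masked
model <- lm(medv ~ ., data = Boston)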
- The first option is to run all possible models. Of course, this method can only be used when the number of candidate variables is small: with p predictors there are 2^p - 1 non-empty subsets, so beyond roughly 15–20 variables the search becomes computationally infeasible. Because every model is evaluated, this approach is guaranteed to find the best subset with respect to whichever metric we choose. Here, we can sort the tibble object by the metric we are interested in (for example, the ...
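To make the cost of the exhaustive search concrete, here is a minimal base-R sketch of it. This is not the olsrr implementation (there, `ols_step_all_possible()` performs this kind of search); the helper `all_subsets()` and the choice of four predictors are assumptions made purely for illustration, and adjusted R-squared is just one possible ranking metric.

```r
library(MASS)  # provides the Boston data frame

# Hypothetical helper (not part of olsrr): fit every non-empty subset of
# `predictors` and record the adjusted R-squared of each fit.
all_subsets <- function(data, response, predictors) {
  rows <- list()
  for (k in seq_along(predictors)) {
    # All combinations of k predictors, returned as a list of character vectors
    combos <- combn(predictors, k, simplify = FALSE)
    for (vars in combos) {
      fit <- lm(reformulate(vars, response = response), data = data)
      rows[[length(rows) + 1]] <- data.frame(
        n_predictors = k,
        predictors   = paste(vars, collapse = " + "),
        adj_r2       = summary(fit)$adj.r.squared
      )
    }
  }
  out <- do.call(rbind, rows)
  out[order(-out$adj_r2), ]  # highest adjusted R-squared first
}

# Four predictors chosen only to keep the example fast: 2^4 - 1 = 15 models.
subset_fits <- all_subsets(Boston, "medv", c("crim", "rm", "lstat", "ptratio"))
head(subset_fits, 3)

# Number of models the same search would fit with every predictor in Boston
n_models_full <- 2^(ncol(Boston) - 1) - 1
n_models_full
```

Each additional candidate variable doubles the number of fits, which is why the approach stops being practical once the number of variables grows much beyond the range mentioned above.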