Visualization and variable reduction
In the previous section, the housing data underwent extensive pre-processing, and we are now ready to analyze it further. We begin with visualization. Because the dataset has many variables, viewing the plots on the R graphics device is somewhat difficult. As in earlier chapters, where we visualized random forests and other large, complex structures, we will open a PDF device and store the graphs in it. In the housing dataset, the main variable of interest is the housing price, so we first designate SalePrice as the output variable. We need to visualize the data in a way that reveals the relationship between the numerous variables and SalePrice. The independent variables can be either numeric or ...
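The idea of sending many plots to a PDF device can be sketched as follows. This is a minimal illustration and not the book's own code: the `housing` data frame here is a small synthetic stand-in, the column names `LotArea` and `Quality` and the file name `housing_plots.pdf` are hypothetical, and treating every non-numeric column as a factor is an assumption made only for this example.

```r
# Toy stand-in for the housing data; every column except SalePrice is hypothetical.
set.seed(1)
housing <- data.frame(
  SalePrice = round(rnorm(50, mean = 180000, sd = 40000)),
  LotArea   = round(runif(50, min = 5000, max = 15000)),
  Quality   = factor(sample(c("Low", "Mid", "High"), 50, replace = TRUE))
)

# Write to a temporary location so the sketch does not touch the working directory.
out_file   <- file.path(tempdir(), "housing_plots.pdf")
predictors <- setdiff(names(housing), "SalePrice")

pdf(out_file)  # open the PDF device; each plot call below adds one page
for (v in predictors) {
  x <- housing[[v]]
  if (is.numeric(x)) {
    # Numeric predictor: scatter plot against the output variable
    plot(x, housing$SalePrice, xlab = v, ylab = "SalePrice",
         main = paste("SalePrice vs", v))
  } else {
    # Non-numeric predictor (assumed to be a factor): one box per level
    boxplot(housing$SalePrice ~ x, xlab = v, ylab = "SalePrice",
            main = paste("SalePrice by", v))
  }
}
invisible(dev.off())  # close the device so the file is flushed to disk
```

Writing one page per variable keeps each plot legible, and the resulting file can be paged through in any PDF viewer instead of being redrawn on the interactive device.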