Visualization and variable reduction
In the previous section, the housing data underwent extensive pre-processing, and we are now ready to analyze it further. We begin with visualization. Because the dataset has many variables, viewing the plots on the interactive R graphics device is cumbersome. As in earlier chapters, where we visualized random forests and other large, complex structures, we will open a PDF device and store the graphs in it. In the housing dataset, the main variable of interest is the housing price, so we will first name the output variable SalePrice. We need to visualize the data in a way that reveals the relationship between the numerous variables and SalePrice. The independent variables can be either numeric or ...
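The idea of writing one plot per variable to a PDF device can be sketched as follows. This is a minimal illustration rather than the book's own code: the toy data frame, the column names Area and Quality, and the output file name are assumptions made for the example, and only SalePrice comes from the text. Numeric predictors are drawn as scatter plots against SalePrice, and all other predictors as box plots.

```r
# Toy stand-in for the housing data. Every column except SalePrice is
# hypothetical and exists only to make the sketch runnable.
set.seed(1)
housing <- data.frame(
  SalePrice = round(rnorm(50, mean = 180000, sd = 40000)),
  Area      = round(runif(50, min = 800, max = 3000)),
  Quality   = factor(sample(c("Low", "Mid", "High"), 50, replace = TRUE))
)

# Hypothetical output location; a temporary directory keeps the sketch tidy.
out_file   <- file.path(tempdir(), "SalePrice_plots.pdf")
predictors <- setdiff(names(housing), "SalePrice")

# Open the PDF device; each high-level plot call starts a new page.
pdf(out_file)
for (v in predictors) {
  x <- housing[[v]]
  if (is.numeric(x)) {
    # Numeric predictor: scatter plot against SalePrice
    plot(x, housing$SalePrice, xlab = v, ylab = "SalePrice",
         main = paste("SalePrice vs", v))
  } else {
    # Non-numeric predictor: one box of SalePrice per level
    boxplot(housing$SalePrice ~ x, xlab = v, ylab = "SalePrice",
            main = paste("SalePrice by", v))
  }
}
# Close the device so the file is written to disk.
invisible(dev.off())
```

Opening the resulting file shows one page per predictor, which is easier to page through than redrawing dozens of plots on the interactive device.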