Let's move on to building our model. We will start by identifying our numerical and categorical variables, and then study the correlations using the correlation matrix and correlation plots.
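As a minimal sketch of separating numerical from categorical variables, the snippet below uses pandas' `select_dtypes()` on a small stand-in DataFrame. The name `df_example` and its columns are hypothetical and used only for illustration; in the recipe, the same calls would apply to `df_housingdata`.

```python
import pandas as pd

# Toy stand-in for df_housingdata; the column names are hypothetical
df_example = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550],
    "Neighborhood": ["A", "B", "A", "C"],
    "SalePrice": [208500.0, 181500.0, 223500.0, 140000.0],
})

# Columns with a numeric dtype (integers and floats)
numerical_cols = df_example.select_dtypes(include="number").columns.tolist()

# Everything else (here, string columns) is treated as categorical
categorical_cols = df_example.select_dtypes(exclude="number").columns.tolist()

print(numerical_cols)
print(categorical_cols)
```

Treating every non-numeric column as categorical is a simplifying assumption; numerically coded categories (for example, a rating stored as integers) would need to be reassigned by hand.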
- First, we'll take a look at the variables and the variable types:
# See the variables and their data types
df_housingdata.dtypes
- We'll then look at the correlation matrix. The corr() method computes the pairwise correlation of columns:
# We pass 'pearson' as the method for calculating our correlation
df_housingdata.corr(method='pearson')
- Besides this, we'd also like to study the correlation between the predictor variables and the response variable:
# We store the correlation matrix output in a variable
pearson = df_housingdata.corr(method='pearson')
...
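One possible way to read the predictor-response correlations off the stored matrix is sketched below. It is an illustration under stated assumptions, not necessarily the approach the recipe takes: `df_example`, its columns, and the choice of `SalePrice` as the response variable are all hypothetical. The sketch restricts the data to numeric columns first, because recent pandas versions may raise an error when `corr()` meets non-numeric columns.

```python
import pandas as pd

# Toy stand-in for df_housingdata; the column names are hypothetical
df_example = pd.DataFrame({
    "LotArea": [1, 2, 3, 4, 5],
    "Age": [5, 4, 3, 2, 1],
    "Neighborhood": list("ABABA"),
    "SalePrice": [10, 20, 30, 40, 50],
})

# Assumed name of the response variable
target = "SalePrice"

# Keep only numeric columns, then compute the pairwise Pearson correlations
pearson_example = df_example.select_dtypes(include="number").corr(method="pearson")

# Take the response column, drop its correlation with itself,
# and sort the predictors from most positive to most negative
corr_with_target = (
    pearson_example[target].drop(target).sort_values(ascending=False)
)

print(corr_with_target)
```

Sorting by the signed value puts strongly negative predictors last; sorting by `corr_with_target.abs()` instead would rank predictors by strength regardless of direction.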