It is usually helpful to complement the numeric analysis of metrics with visualizations that show us the predictions and the mistakes the model is making. A good first step is to look at the distribution of the residuals:
eval_df["residuals"].hist(bins=25, ec='k');
The output will be as follows:
We see that most of the residuals are within 2,000 dollars of zero and that they are more or less evenly distributed; however, a high proportion of them fall between -1,000 and 0. Let's calculate how many of the residuals are negative (meaning the model is ...
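As a minimal sketch of counting the negative residuals, the snippet below builds a small synthetic stand-in for `eval_df` and computes the share of residuals below zero. The data is randomly generated for illustration and is not the dataset used in this chapter. The column names `y_true` and `y_pred` and the sign convention (residual = actual minus predicted) are assumptions; under that convention, a negative residual means the prediction is higher than the actual price.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for eval_df: synthetic prices, NOT the chapter's data.
rng = np.random.default_rng(0)
y_true = rng.uniform(500, 15000, size=1000)
y_pred = y_true + rng.normal(loc=150, scale=900, size=1000)

eval_df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred})

# Assumed sign convention: residual = actual - predicted.
eval_df["residuals"] = eval_df["y_true"] - eval_df["y_pred"]

# The mean of a boolean mask is the fraction of True values.
negative_share = (eval_df["residuals"] < 0).mean()

# Share of residuals whose absolute value is at most 2,000.
within_2000 = eval_df["residuals"].abs().le(2000).mean()

print(f"Negative residuals: {negative_share:.1%}")
print(f"Residuals within 2,000 dollars of zero: {within_2000:.1%}")
```

Taking the mean of the boolean mask avoids a separate count and division, and the same pattern works for any other threshold we want to check.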