In addition to examining the residuals, we should calculate metrics to evaluate our regression model. Perhaps the most common is R² (pronounced R-squared), or the coefficient of determination, which quantifies the proportion of variance in the dependent variable that we can predict from our independent variables. It is calculated by subtracting the ratio of the sum of squared residuals to the total sum of squares from one:

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

Sigma (Σ) represents the sum. The average of the y values is denoted as ȳ (pronounced y-bar). The predictions are denoted with ŷ (pronounced y-hat).

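This value will typically be in [0, 1], where higher values are better; it can fall below 0 when the model fits worse than always predicting ȳ, for example on data it was not trained on. Objects ...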
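
As a rough illustration of the calculation described above, here is a minimal sketch in plain Python. The function name `r_squared` and the sample values are hypothetical and chosen only for this example; it assumes the actual values are not all identical, since the total sum of squares would otherwise be zero.

```python
def r_squared(y_true, y_pred):
    """Return 1 minus (sum of squared residuals / total sum of squares)."""
    y_bar = sum(y_true) / len(y_true)  # y-bar: mean of the actual values
    # Sum of squared residuals: actual values minus predictions (y-hat)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    # Total sum of squares: actual values minus their mean
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Made-up data purely for illustration
y = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 7.0, 9.0]
print(r_squared(y, y_hat))
```

Perfect predictions give a value of 1, while always predicting ȳ gives 0, which matches the interpretation that higher values are better.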
