Evaluating linear regression

Amazon ML uses the standard metric RMSE (root mean squared error) for linear regression. RMSE is defined as the square root of the mean of the squared differences between the real values and the predicted values:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$

where ŷ_i are the predicted values, y_i the real values, and N the number of samples. The closer the predictions are to the real values, the lower the RMSE; therefore, a lower RMSE is interpreted as a better predictive model.
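As a minimal sketch of the metric, taking RMSE as the square root of the mean squared difference between predictions and real values, the calculation can be written in a few lines of plain Python. The function name `rmse` and the sample values are hypothetical and chosen only for illustration; they are not taken from the airline dataset or from Amazon ML itself.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between real and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    # Squared difference for each (real, predicted) pair
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    # Mean of the squared differences, then the square root
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values, for illustration only
actual = [10.0, 20.0, 30.0]
close_predictions = [11.0, 19.0, 31.0]  # each off by 1
far_predictions = [15.0, 10.0, 40.0]    # off by 5, 10 and 10

rmse_close = rmse(actual, close_predictions)
rmse_far = rmse(actual, far_predictions)

print(rmse_close)
print(rmse_far)
```

The predictions that sit closer to the real values give the smaller RMSE, which is why a lower score is read as a better model. Because the errors are squared before averaging, a few large misses weigh more heavily than many small ones.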

To demonstrate evaluation in the regression context, we will consider a simplified version of the Airlines delay dataset available on Kaggle at https://www.kaggle.com/giovamata/airlinedelaycauses. The full dataset is quite large (~250 MB). We ...
