December 2018
Beginner to intermediate
684 pages
21h 9m
English
The least squares method is the original method to learn the parameters of the hyperplane that best approximates the output from the input data. As the name suggests, the best approximation minimizes the sum of the squared distances between the output value and the hyperplane represented by the model.
The difference between the model's prediction and the actual outcome for a given data point is the residual (whereas the deviation of the true model from the true output in the population is called error). Hence, in formal terms, the least squares estimation method chooses the coefficient vector
to minimize the residual sum of squares ...