At the core of linear regression is the search for the equation of a line that minimizes the sum of the squared differences between the line's *y* values and the original ones. As a reminder, let's say our regression function is called `h` and its predictions `h(X)`, as in this formulation:
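One common way to write such a function, assuming a feature matrix $X$ whose first column is all ones (for the intercept) and a coefficient vector $w$ (both symbols are assumptions made here for illustration), is:

$$h(X) = Xw$$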

Consequently, our cost function to be minimized is as follows:
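A standard sum-of-squared-errors form, assuming $n$ observations $(x_i, y_i)$ (the symbols $J$, $n$, and the index $i$ are assumptions made here for illustration), would be:

$$J = \sum_{i=1}^{n} \left(h(x_i) - y_i\right)^2$$

Many texts scale this sum by a constant such as $\frac{1}{2n}$, which changes the value of the cost but not the coefficients that minimize it.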

Quite a few methods can minimize it, and some scale better than others to large quantities of data. Among the better performers, ...
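To make the cost function concrete, here is a minimal sketch, assuming NumPy and a tiny made-up dataset. The names `cost`, `w_guess`, and `w_fit` are hypothetical, and `np.linalg.lstsq` is used only as one convenient least-squares minimizer, not necessarily the method discussed here. It evaluates the sum of squared errors for an arbitrary guess and for the fitted coefficients.

```python
import numpy as np


def h(X, w):
    """Linear hypothesis: one prediction per row of X."""
    return X @ w


def cost(X, y, w):
    """Sum of squared differences between predictions and observed y."""
    residuals = h(X, w) - y
    return float(residuals @ residuals)


# Made-up data for illustration only (roughly y = 1 + 2x).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Prepend a column of ones so that w[0] acts as the intercept.
X = np.column_stack([np.ones_like(x), x])

# An arbitrary starting guess: intercept 0, slope 1.
w_guess = np.array([0.0, 1.0])

# Least-squares solution; lstsq returns (solution, residuals, rank, singular values).
w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)

print("cost at guess:", cost(X, y, w_guess))
print("cost at fit:  ", cost(X, y, w_fit))
print("fitted coefficients:", w_fit)
```

The fitted coefficients give a much smaller cost than the guess, which is exactly what minimizing the squared errors means.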
