Stochastic gradient descent algorithms

After discussing the basics of logistic regression, it's useful to introduce the SGDClassifier class, which implements a very common algorithm that can be applied to several different loss functions. The idea behind SGD is to minimize a cost function by iterating a weight update based on the gradient:

$$\bar{w}^{(t+1)} = \bar{w}^{(t)} - \eta \nabla_{\bar{w}} L\left(\bar{w}^{(t)}\right)$$
However, instead of considering the whole dataset, the update procedure is applied to batches randomly extracted from it (for this reason, it is often also called mini-batch gradient descent). In the preceding formula, L is the cost function we want to minimize with respect to the parameters, and η is the learning rate.
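To make the update rule concrete, here is a minimal NumPy sketch of mini-batch SGD applied to a logistic loss. The function name minibatch_sgd, the constant learning rate eta, and the omission of an intercept term are illustrative assumptions, not a listing from the book:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def minibatch_sgd(X, Y, eta=0.01, batch_size=32, n_epochs=100, seed=1000):
    # Illustrative sketch: logistic loss, constant learning rate,
    # no intercept term
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        # Randomly extract mini-batches from the dataset
        indices = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            batch = indices[start:start + batch_size]
            Xb, Yb = X[batch], Y[batch]
            # Gradient of the logistic loss on the current batch
            grad = Xb.T @ (sigmoid(Xb @ w) - Yb) / len(batch)
            # Weight update: w <- w - eta * grad(L)
            w -= eta * grad
    return w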

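In scikit-learn, the same procedure is exposed through the SGDClassifier class. The following usage sketch trains it with a logistic loss on a synthetic dataset; the hyperparameter values are illustrative choices rather than recommendations (note that loss='log_loss' is named 'log' in scikit-learn versions before 1.1):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier

# Synthetic binary classification problem (illustrative values)
X, Y = make_classification(n_samples=1000, n_features=10,
                           n_informative=5, random_state=1000)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=1000)

# loss='log_loss' makes SGDClassifier behave like logistic regression
# trained with stochastic gradient descent
sgd = SGDClassifier(loss='log_loss', learning_rate='constant',
                    eta0=0.01, max_iter=1000, random_state=1000)
sgd.fit(X_train, Y_train)
print('Test accuracy: {:.3f}'.format(sgd.score(X_test, Y_test)))

Because the same class accepts other loss functions (for example, 'hinge' or 'perceptron'), switching the loss parameter changes the model being trained while keeping the same optimization procedure.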