18 Optimizing Neural Networks

In this chapter, we're going to discuss the most important optimization algorithms derived from the basic Stochastic Gradient Descent (SGD) approach. Plain SGD can be quite ineffective when working with very high-dimensional functions, leaving models stuck in sub-optimal solutions. The optimizers discussed in this chapter aim to speed up convergence and to reduce the risk of settling on such solutions. We'll also discuss how to apply L₁ and L₂ regularization to a layer of a deep neural network, and how to avoid overfitting using these advanced approaches.

In particular, the topics covered in the chapter are as follows:

Optimized SGD algorithms (Momentum, RMSProp, Adam, AdaGrad, ...
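
As a rough illustration of why the momentum-based variants listed above are worth studying, the following minimal sketch compares plain gradient descent with a momentum update on a toy problem. Everything in it is an assumption made for illustration and is not taken from the chapter: the quadratic function with curvatures 1 and 100, the learning rate, the momentum coefficient, and the helper names `loss`, `grad`, and `run`. It also uses the exact gradient rather than a stochastic mini-batch estimate, to keep the comparison deterministic.

```python
import numpy as np

# Hypothetical toy objective: f(x) = 0.5 * x^T A x, with very different
# curvatures along the two axes (an ill-conditioned quadratic bowl).
A = np.diag([1.0, 100.0])


def loss(x):
    """Value of the quadratic objective at x."""
    return 0.5 * x @ A @ x


def grad(x):
    """Exact gradient of the objective (no mini-batch noise in this sketch)."""
    return A @ x


def run(momentum, lr=0.009, steps=100):
    """Run gradient descent with a velocity term; momentum=0.0 is plain descent."""
    x = np.array([1.0, 1.0])
    v = np.zeros_like(x)
    for _ in range(steps):
        # The velocity accumulates an exponentially decaying sum of past gradients.
        v = momentum * v - lr * grad(x)
        x = x + v
    return loss(x)


loss_plain = run(momentum=0.0)
loss_momentum = run(momentum=0.9)
print(f"plain: {loss_plain:.6f}, momentum: {loss_momentum:.6f}")
```

The learning rate has to stay small to remain stable along the steep direction, which makes plain descent slow along the flat one. With the same learning rate, the velocity term lets the update keep moving along the flat direction, so the momentum run ends at a lower loss after the same number of steps.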


Mastering Machine Learning Algorithms - Second Edition by Giuseppe Bonaccorso
