In this final chapter, we’ll slow down a bit and consider gradient descent afresh. We’ll begin by reviewing the idea of gradient descent using illustrations, discussing what it is and how it works. Next, we’ll explore the meaning of stochastic in stochastic gradient descent. Gradient descent is a simple algorithm that invites tweaking, so after we explore stochastic gradient descent, we’ll consider a useful and commonly used tweak: momentum. We’ll conclude the chapter by discussing more advanced, adaptive gradient descent algorithms, specifically RMSprop, Adagrad, and Adam.
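Before diving in, it may help to see the two simplest update rules the chapter builds on side by side. The sketch below is illustrative only, not code from the book: it minimizes the toy function f(x) = x², whose gradient is 2x, first with plain gradient descent and then with the momentum tweak previewed above. The function names, learning rate, and momentum coefficient are all assumptions chosen for the example.

```python
# Illustrative sketch (not from the book): plain gradient descent vs. the
# momentum variant, minimizing the toy function f(x) = x**2.

def gradient(x):
    """Gradient of f(x) = x**2."""
    return 2.0 * x

def gradient_descent(x0, lr=0.1, steps=50):
    """Repeatedly step opposite the gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * gradient(x)
    return x

def momentum_descent(x0, lr=0.1, mu=0.9, steps=500):
    """Keep a velocity that accumulates past gradients, then move by it."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = mu * v - lr * gradient(x)
        x += v
    return x

# Both runs approach the minimum at x = 0.
print(gradient_descent(5.0))
print(momentum_descent(5.0))
```

Momentum's velocity term damps oscillation across steep directions and speeds travel along shallow ones, which is why it is such a common tweak; the chapter develops this intuition in detail.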

This is a math book, but gradient descent is very much ...
