Gradient descent

Gradient descent is a general-purpose optimization algorithm with a wide variety of applications. It minimizes the cost function by iteratively adjusting the model parameters: on each iteration, it computes the partial derivative of the cost with respect to each parameter and moves that parameter a small step in the opposite direction. In the simplest case of a single parameter, plotting the cost against the parameter value gives a convex, bowl-shaped curve, as shown in the following diagram.
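
Concretely, writing α for the learning rate (the step-size hyperparameter, a symbol introduced here for illustration), the standard update rule applied to a single parameter θ on each iteration is:

θ ← θ − α ∂J(θ)/∂θ

Too small an α makes progress slow; too large an α can overshoot the minimum and fail to converge.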

You can see that as we vary θ from right to left in the preceding diagram, the cost, J(θ), decreases to a minimum and then rises again. The aim is that on each iteration of gradient descent, the cost moves closer to the minimum, and the updates stop once the gradient is effectively zero, that is, once the algorithm has converged.
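
To make the procedure concrete, here is a minimal PyTorch sketch of gradient descent on a toy one-parameter cost, J(θ) = (θ − 3)². The cost function, the starting value, the learning rate of 0.1, and the fixed iteration count are illustrative choices for this sketch, not values taken from the book:

import torch

# Toy convex cost J(theta) = (theta - 3)^2, minimized at theta = 3 (illustrative)
theta = torch.tensor(0.0, requires_grad=True)  # initial guess for the parameter
lr = 0.1                                       # learning rate (illustrative value)

for step in range(100):
    cost = (theta - 3.0) ** 2      # evaluate the cost at the current theta
    cost.backward()                # autograd computes dJ/dtheta
    with torch.no_grad():
        theta -= lr * theta.grad   # step against the gradient (downhill)
    theta.grad.zero_()             # clear the gradient before the next iteration

print(theta.item())  # approaches 3.0, the minimizer of the curve

After enough iterations, the gradient shrinks towards zero and the updates become negligible, which is exactly the stopping behaviour described above.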
