Gradient descent is a general-purpose optimization algorithm with a wide variety of applications. It minimizes the cost function by iteratively adjusting the model parameters, using the partial derivative of the cost function with respect to each parameter to determine the direction of steepest descent. If we plot the cost function against a parameter value for a model such as linear regression with a squared error cost, it forms a convex curve, as shown in the following diagram:
You can see that as we vary θ, from right to left in the preceding diagram, the cost, J(θ), decreases to a minimum and then rises again. The aim is that on each iteration of gradient descent, the cost moves closer to this minimum, and the algorithm stops once the gradient approaches zero, indicating that it has converged.
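As a minimal sketch of how this might look in practice, the following Python snippet runs gradient descent on a one-parameter squared error cost. The data, the initial value of θ, the learning rate alpha, and the stopping tolerance tol are all illustrative assumptions, not values from the text:

```python
import numpy as np

# Illustrative data: y is roughly 3 * x, so the optimal theta is near 3 (assumption).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

theta = 0.0   # initial parameter value (assumption)
alpha = 0.01  # learning rate (assumption)
tol = 1e-6    # stop when the gradient is effectively zero (assumption)

for i in range(10_000):
    # Cost: J(theta) = (1 / 2m) * sum((theta * x - y)^2), a convex function of theta
    error = theta * x - y
    # Partial derivative of J with respect to theta: (1 / m) * sum((theta * x - y) * x)
    grad = (error * x).mean()
    if abs(grad) < tol:
        break  # the gradient is near zero, so we are at the minimum
    # Gradient descent update rule: theta := theta - alpha * dJ/dtheta
    theta -= alpha * grad

print(f"theta = {theta:.4f} after {i} iterations")
```

Because the cost is convex in θ, each update moves θ downhill toward the single minimum, and the loop exits once the gradient is close enough to zero.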