January 2019
Intermediate to advanced
390 pages
9h 16m
English
We learned about gradient descent and how it works in earlier chapters, and we saw that the search direction is the negative gradient, -∇f(x), which is the direction of steepest descent. It is also called the Cauchy method because Cauchy introduced it in 1847, and it has been very popular ever since. We start from an arbitrary point on the objective function surface and change the variables (in earlier chapters, these were the weights and biases) along the direction of the negative gradient. Mathematically, it is represented as follows:

$$x_{n+1} = x_n - \alpha_n \nabla f(x_n)$$

Here α_n is the step size (also called the learning rate) at iteration n. Gradient descent algorithms have worked well in ...
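
To make the update rule concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the book's own implementation: the function name `gradient_descent`, the quadratic test objective, the constant step size of 0.1, and the iteration count are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence


def gradient_descent(
    grad: Callable[[Sequence[float]], List[float]],
    x0: Sequence[float],
    step_size: Callable[[int], float],
    n_iters: int,
) -> List[float]:
    """Repeatedly move x against the gradient: x <- x - alpha_n * grad(x)."""
    x = list(x0)
    for n in range(n_iters):
        alpha_n = step_size(n)  # step size (learning rate) at iteration n
        g = grad(x)             # gradient evaluated at the current point
        x = [xi - alpha_n * gi for xi, gi in zip(x, g)]
    return x


# Assumed example objective: f(x, y) = (x - 3)^2 + 2 * (y + 1)^2,
# whose unique minimum lies at (3, -1).
def grad_f(p: Sequence[float]) -> List[float]:
    x, y = p
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


# Start from an arbitrary point and use a constant step size (an assumption;
# step_size may be any function of the iteration index n).
x_min = gradient_descent(grad_f, x0=[0.0, 0.0], step_size=lambda n: 0.1, n_iters=200)
```

With this objective and step size, each coordinate's distance to the minimum shrinks by a constant factor per iteration, so `x_min` ends up very close to (3, -1). A step size that is too large for the objective's curvature would make the iterates diverge instead, which is why the choice of α_n matters.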