In multiple examples (including those in Chapter 4, Regression, and Chapter 5, Classification), we took advantage of an optimization technique called gradient descent. There are multiple variants of the gradient descent method, and you will see them throughout the machine learning world. Most prominently, they are used to determine optimal coefficients for algorithms such as linear or logistic regression, and thus they also often play a role in more complicated techniques that are at least partially based on linear/logistic regression (such as neural networks).
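To make the linear regression case mentioned above concrete, here is a minimal sketch of batch gradient descent fitting the two coefficients of a line, y = m*x + b, by minimizing mean squared error. It is an illustration only and not the code from the earlier chapters: the function name `fit_line`, the parameters `learning_rate` and `epochs`, and the toy data are all assumptions made for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=5000):
    """Fit y = m*x + b by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; the default learning rate and
    epoch count are assumptions tuned to the toy data below.
    """
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of the mean squared error with respect to m and b.
        grad_m = sum(-2.0 * x * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
        grad_b = sum(-2.0 * (y - (m * x + b)) for x, y in zip(xs, ys)) / n
        # Move each coefficient a small step against its gradient.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, b = fit_line(xs, ys)
print(f"slope: {m:.4f}, intercept: {b:.4f}")
```

On this noise-free data the fitted slope and intercept should approach 2 and 1. With a learning rate that is too large for the data, the updates can overshoot and diverge, which is why the step size is exposed as a parameter here.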
The general idea of gradient descent methods is to determine a direction and magnitude of change in some parameters that will move ...