Programming Neural Networks with Python
by Rheinwerk Publishing, Inc, Dr. Joachim Steinwendner, Dr. Roland Schwaiger
7.3 The Optimization Method
As we explained in Chapter 6, the optimization method of choice is gradient descent. The name is unwieldy, but the behavior it describes is quite natural: it is comparable to hiking in the mountains with the goal of getting down to the valley as quickly as possible, so at every step you choose the steepest way down.
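To make the hiking picture concrete, here is a minimal sketch of plain gradient descent in Python. It is an illustration only, not the book's own implementation: the function names (`gradient_descent`, `loss`, `loss_grad`), the two-weight quadratic "landscape" with its valley at (3, -1), and the learning rate of 0.1 are all assumptions chosen for this example.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, i.e. in the steepest downhill direction."""
    w = list(w0)
    for _ in range(steps):
        g = grad(w)  # the gradient points uphill
        # Move each weight a small step in the opposite (downhill) direction.
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


# Hypothetical loss landscape (an assumption for illustration):
# a bowl whose lowest point, the "valley", lies at w = (3, -1).
def loss(w):
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


def loss_grad(w):
    # Partial derivatives of the loss with respect to each weight.
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


w_final = gradient_descent(loss_grad, [0.0, 0.0])
print(w_final, loss(w_final))
```

Starting from (0, 0), every step moves the weights a little closer to the valley. On this simple bowl, a smaller learning rate would make the descent slower, while a much larger one could overshoot the valley altogether.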
With the number of weights to be trained in convolutional or transformer neural networks, training can take a long time. This is why we’re always on the lookout for faster gradient-based methods, which can often speed up the training process considerably.
We already know that the weights in the network are constantly undergoing small changes during training in the search for the optimum. In mathematical terms, for our weights ...