Backward propagation

We minimize the loss function using the gradient descent algorithm. That is, we backpropagate through the network, compute the gradient of the loss function with respect to the weights, and update the weights. We have two sets of weights: the input-to-hidden layer weights and the hidden-to-output layer weights. We compute the gradient of the loss with respect to both of these weight matrices and update each of them according to the weight update rule:

$$W = W - \alpha \frac{\partial J}{\partial W}$$

Here, $J$ is the loss function and $\alpha$ is the learning rate.
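As a concrete illustration, the following is a minimal NumPy sketch of one such update step for a network with a single hidden layer. The weight names (`W_xh`, `W_hy`), the sigmoid activation, and the mean squared error loss are assumptions made for this example; the excerpt only states that there are input-to-hidden and hidden-to-output weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backward_prop(x, y, W_xh, W_hy, alpha=0.01):
    # Forward pass: input -> hidden -> output (activations assumed sigmoid)
    z1 = x @ W_xh          # hidden layer pre-activation
    a1 = sigmoid(z1)       # hidden layer activation
    z2 = a1 @ W_hy         # output layer pre-activation
    y_hat = sigmoid(z2)    # network output

    # Gradient of the (assumed) mean squared error loss w.r.t. the output pre-activation
    delta2 = (y_hat - y) * sigmoid_derivative(z2)

    # Gradient of the loss w.r.t. the hidden-to-output weights
    dJ_dW_hy = a1.T @ delta2

    # Backpropagate the error to the hidden layer
    delta1 = (delta2 @ W_hy.T) * sigmoid_derivative(z1)

    # Gradient of the loss w.r.t. the input-to-hidden weights
    dJ_dW_xh = x.T @ delta1

    # Weight update rule: W = W - alpha * dJ/dW
    W_xh = W_xh - alpha * dJ_dW_xh
    W_hy = W_hy - alpha * dJ_dW_hy
    return W_xh, W_hy

# Example usage on random data (shapes chosen arbitrarily for illustration)
x = np.random.randn(4, 3)       # 4 samples, 3 input features
y = np.random.randn(4, 1)       # 4 targets
W_xh = np.random.randn(3, 5)    # input-to-hidden weights
W_hy = np.random.randn(5, 1)    # hidden-to-output weights
W_xh, W_hy = backward_prop(x, y, W_xh, W_hy)
```

Repeating this step over many iterations drives both weight matrices toward values that minimize the loss.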

In order to ...
