Backpropagation and automatic differentiation

Computing partial derivatives is repeated thousands upon thousands of times while training a neural network, so this process must be as efficient as possible.

In the previous sections, we showed how a loss function makes it possible to tie together the model's output, the input, and the label. If we represent the whole neural network architecture as a graph, it's easy to see that, given an input instance, we are just performing a sequence of mathematical operations (multiplying the inputs by the parameters, summing the products, and applying the non-linearity to the sum) in an ordered manner. At the input of this graph, we have the input ...
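The ordered operations described above (multiply, sum, apply the non-linearity, compute the loss) can be sketched as a tiny graph for a single neuron. This is only an illustrative sketch in plain Python, not the book's code: the sigmoid activation, the squared-error loss, and all names (`forward_backward`, `numerical_grad_w`, the sample values) are assumptions chosen for the example. It walks the graph forward to get the loss, then backward with the chain rule to get the partial derivatives, and compares them with a finite-difference estimate.

```python
import math


def sigmoid(z):
    # Non-linearity assumed for this sketch.
    return 1.0 / (1.0 + math.exp(-z))


def forward_backward(x, w, b, y):
    """Forward pass through the graph, then chain rule from the loss back."""
    # Forward: multiply inputs by parameters, sum, apply the non-linearity.
    products = [wi * xi for wi, xi in zip(w, x)]
    z = sum(products) + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2  # squared-error loss (assumed)

    # Backward: visit the same nodes in reverse order.
    dloss_da = a - y                  # d(loss)/d(a)
    da_dz = a * (1.0 - a)             # derivative of the sigmoid
    dloss_dz = dloss_da * da_dz       # chain rule
    grad_w = [dloss_dz * xi for xi in x]  # dz/dw_i = x_i
    grad_b = dloss_dz                     # dz/db = 1
    return loss, grad_w, grad_b


def numerical_grad_w(x, w, b, y, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        loss_plus, _, _ = forward_backward(x, w_plus, b, y)
        loss_minus, _, _ = forward_backward(x, w_minus, b, y)
        grads.append((loss_plus - loss_minus) / (2 * eps))
    return grads


# Hypothetical sample values for illustration.
x, w, b, y = [1.0, 2.0], [0.5, -0.3], 0.1, 1.0
loss, grad_w, grad_b = forward_backward(x, w, b, y)
approx_w = numerical_grad_w(x, w, b, y)
print(loss, grad_w, grad_b, approx_w)
```

The backward pass reuses the values already computed in the forward pass and needs a single sweep over the graph for all parameters, whereas the finite-difference check needs two extra forward passes per parameter. That difference is why the efficiency concern raised at the start of this section matters.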
