Applying Backpropagation

Here is our three-layered network, in the form of a computational graph:

[Figure: images/training/backprop_network_1.png — the three-layered network drawn as a computational graph]

All the variables in the preceding graph are matrices, with the exception of the loss L. The σ symbol represents the sigmoid. For reasons that will become clear in a minute, I squashed the softmax and the cross-entropy loss into one operation. I needed a name for that operation, so I temporarily called it SML, for “softmax and loss.” Finally, I gave the names a and b to the outputs of the matrix multiplications.
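To make the graph concrete, here is a minimal NumPy sketch of that forward pass. The variable and function names (X, Y, w1, w2, forward) are my own illustrative choices, not necessarily the book's, and I omit details such as bias terms: the point is just to show the two matrix multiplications producing a and b, the sigmoid in between, and the softmax and cross-entropy loss fused into a single SML step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    # Subtract the row-wise max for numerical stability
    exps = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

def forward(X, w1, w2, Y):
    a = np.matmul(X, w1)       # first matrix multiplication -> a
    h = sigmoid(a)             # hidden layer activation (the sigma node)
    b = np.matmul(h, w2)       # second matrix multiplication -> b
    # "SML": softmax followed by the cross-entropy loss,
    # treated as one operation and averaged over the batch
    y_hat = softmax(b)
    L = -np.mean(np.sum(Y * np.log(y_hat), axis=1))
    return L
```

Fusing softmax and cross-entropy into one SML node pays off during backpropagation, because the combined operation has a much simpler gradient than the two taken separately.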

The diagram shown earlier represents the same neural network that we designed and built in the previous two chapters. Follow its operations ...
