In this appendix, we use the formal neural network notation from Appendix A to dive into the partial-derivative calculus behind the backpropagation method introduced in Chapter 8.
Let’s begin by defining some additional notation to help us along. Backpropagation works backwards, so the notation is based on the final layer (denoted $L$), and the earlier layers are annotated relative to it ($L-1$, $L-2$, $\ldots$, $L-n$). The weights, biases, and function outputs are subscripted with this same layer notation. Recall from Equations 7.1 and 7.2 that the layer activation $a_L$ is calculated by multiplying the preceding layer’s activation $a_{L-1}$ by the weight $w_L$, adding the bias $b_L$ to produce $z_L = w_L a_{L-1} + b_L$, and passing this through an activation function to produce $a_L = \sigma(z_L)$.
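To make the final-layer computation concrete, here is a minimal NumPy sketch of the forward pass for layer $L$ and the partial derivatives that backpropagation computes for that layer. The appendix has not yet fixed a particular activation or cost, so this sketch assumes a sigmoid activation and a quadratic cost purely for illustration; the toy dimensions, random values, and target vector `y` are likewise placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Derivative of the sigmoid, needed for the chain rule below
    s = sigmoid(z)
    return s * (1.0 - s)

# Toy dimensions: 3 neurons in layer L-1, 2 neurons in layer L (illustrative)
rng = np.random.default_rng(42)
a_prev = rng.random(3)        # a_{L-1}: preceding layer's activation
w_L = rng.random((2, 3))      # w_L: weights into layer L
b_L = rng.random(2)           # b_L: biases of layer L
y = np.array([0.0, 1.0])      # target output (assumed for the example)

# Forward pass (Equations 7.1 and 7.2): z_L = w_L a_{L-1} + b_L, a_L = sigma(z_L)
z_L = w_L @ a_prev + b_L
a_L = sigmoid(z_L)

# Backward pass for the final layer, assuming quadratic cost C = 0.5 * ||a_L - y||^2.
# Chain rule through the activation: dC/dz_L = (a_L - y) * sigma'(z_L)
delta_L = (a_L - y) * sigmoid_prime(z_L)

# dC/dw_L is the outer product of delta_L with a_{L-1}; dC/db_L is delta_L itself
grad_w_L = np.outer(delta_L, a_prev)
grad_b_L = delta_L

print(grad_w_L.shape, grad_b_L.shape)  # (2, 3) (2,)
```

The quantity `delta_L` is what then gets passed backwards: each earlier layer's error is obtained from the layer after it via $w^{\mathsf{T}} \delta$ and that layer's own $\sigma'(z)$, which is the recursive step the rest of this appendix derives.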