
Neural networks 261
But this last sum over the K components of the label ℓ(ν) is just unity, and
therefore we have

\[
-\delta_k^o(\nu) = -\ell_k(\nu) + m_k(\nu), \quad k = 1\ldots K,
\]

which may be written as the K-component vector

\[
\boldsymbol{\delta}^o(\nu) = \boldsymbol{\ell}(\nu) - \boldsymbol{m}(\nu). \tag{6.38}
\]
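As a concrete numerical illustration of Equation (6.38) (a sketch only; the one-hot label and the softmax output values below are hypothetical, with K = 3):

```python
import numpy as np

# Hypothetical K = 3 output neurons for a single training pair nu.
ell = np.array([0.0, 1.0, 0.0])   # one-hot label ell(nu)
m = np.array([0.2, 0.7, 0.1])     # softmax outputs m(nu); components sum to unity
delta_o = ell - m                 # Equation (6.38): delta^o(nu) = ell(nu) - m(nu)
print(delta_o)                    # [-0.2  0.3 -0.1]
```

The deltas are largest where the network's output disagrees most with the label.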
From Equation (6.36), we can therefore express the third step in the back-
propagation algorithm in the form of the matrix equation (see Exercise 6)

\[
\boldsymbol{W}^o(\nu + 1) = \boldsymbol{W}^o(\nu) + \eta\,\boldsymbol{n}(\nu)\boldsymbol{\delta}^o(\nu)^\top. \tag{6.39}
\]
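The shapes in this update can be checked with a short sketch (the sizes L = 4 and K = 3, the learning rate, and the random values are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
L, K = 4, 3                           # hidden and output neurons (hypothetical)
eta = 0.01                            # learning rate (hypothetical)
W_o = rng.normal(size=(L + 1, K))     # synaptic weight matrix W^o(nu)
n = rng.normal(size=(L + 1, 1))       # hidden-layer output n(nu), including bias component
delta_o = rng.normal(size=(K, 1))     # output deltas delta^o(nu) from Equation (6.38)

# Equation (6.39): the correction eta * n(nu) delta^o(nu)^T is an outer product.
W_o_next = W_o + eta * n @ delta_o.T
print(W_o_next.shape)                 # (5, 3), i.e. (L + 1) x K
```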
Here W^o(ν + 1) indicates the synaptic weight matrix after the update for the νth
training pair. Note that the second term on the right-hand side of Equation
(6.39) is an outer product, yielding a matrix of dimension (L + 1) × K and so
matching ...