August 2019
Intermediate to advanced
242 pages
5h 45m
English
RNNs themselves are an important architectural innovation, but run into problems in terms of their gradients vanishing. When gradient values become so small that the updates are equally tiny, this slows or even halts learning. Your digital neurons die, and your network doesn't do what you want it to do. But is a neural network with a bad memory better than one with no memory at all?
Let's zoom in a bit and discuss what's actually going on when you run into this problem. Recall the formula for calculating the value for a given weight during backpropagation:
W = W - LR*G
Here, the weight value equals the weight minus (learning rate multiplied by the gradient).
Your network is propagating error derivatives across ...
Read now
Unlock full access