August 2018
Intermediate to advanced
438 pages
12h 3m
English
Objective functions for highly non-linear DNNs have extremely steep regions resembling cliffs, as shown in the following figure. Taking a step in the direction of the negative gradient at such a cliff can change the weights so much that we jump off the cliff structure altogether. We can then miss the minimum even when we are very close to it, nullifying much of the work that had been done to reach the current solution:

We can avoid such bad moves in gradient descent by clipping the gradient, that is, setting an upper bound on the magnitude of ...
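As a minimal sketch of the idea, the following NumPy snippet shows one common variant, clipping by L2 norm: if the gradient's norm exceeds a threshold, the gradient is rescaled so that its norm equals the threshold, which keeps its direction but limits the step size. The function name `clip_gradient_by_norm`, the parameter `max_norm`, and the example values are illustrative assumptions, not taken from the text.

```python
import numpy as np


def clip_gradient_by_norm(grad, max_norm):
    """Return grad rescaled so that its L2 norm is at most max_norm.

    Hypothetical helper for illustration; the direction of the
    gradient is preserved, only its magnitude is bounded.
    """
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        # Shrink the gradient onto the ball of radius max_norm.
        return grad * (max_norm / norm)
    # Gradients already within the bound are returned unchanged.
    return grad


# A very steep "cliff" gradient with L2 norm 500 (assumed values).
steep_grad = np.array([300.0, 400.0])
clipped = clip_gradient_by_norm(steep_grad, max_norm=5.0)

learning_rate = 0.1
weights = np.array([1.0, 1.0])

# Without clipping the update moves the weights by 50 units;
# with clipping the step length is bounded by learning_rate * max_norm.
unclipped_step = learning_rate * steep_grad
clipped_step = learning_rate * clipped
weights = weights - clipped_step

print("unclipped step length:", np.linalg.norm(unclipped_step))
print("clipped step length:  ", np.linalg.norm(clipped_step))
```

With clipping, the length of each update can never exceed the learning rate times `max_norm`, however steep the local gradient is. The threshold itself is a hyperparameter that usually has to be tuned for the problem at hand.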