April 2018
Intermediate to advanced
334 pages
10h 18m
English
The following factors call for the application of Xavier initialization:
If the weights in a network start out very small, the signals shrink as they pass through each layer and become too small to be useful by the later layers
If the weights start out very large, the signals grow massively as they pass through each layer and become too large to be useful; with squashing activations such as sigmoid or tanh, they saturate the activation functions in the later layers
Thus, Xavier initialization scales the initial weights so that the signals stay within a usable range from layer to layer, minimizing the chance that they become either too small or too large.
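The effect described above can be checked numerically. The following is a minimal NumPy sketch, not the book's own code: it assumes the common Glorot uniform form, in which weights are drawn from a uniform distribution with variance 2 / (fan_in + fan_out), and this may differ in detail from the formula the book uses. The helper names (`xavier_uniform`, `tiny`, `huge`, `final_layer_std`), the layer width, the depth, and the tanh activation are all illustrative choices.

```python
import numpy as np


def xavier_uniform(fan_in, fan_out, rng):
    """Sample weights uniformly so that Var(W) = 2 / (fan_in + fan_out).

    Assumed Glorot-style formula; U(-a, a) has variance a**2 / 3.
    """
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def tiny(fan_in, fan_out, rng):
    """Hypothetical 'too small' initializer: Gaussian weights with std 0.01."""
    return 0.01 * rng.standard_normal((fan_in, fan_out))


def huge(fan_in, fan_out, rng):
    """Hypothetical 'too large' initializer: Gaussian weights with std 1.0."""
    return 1.0 * rng.standard_normal((fan_in, fan_out))


def final_layer_std(init_fn, n_layers=10, width=256, batch=512, seed=0):
    """Push random inputs through n_layers tanh layers.

    Returns the standard deviation of the activations of the last layer.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))
    for _ in range(n_layers):
        w = init_fn(width, width, rng)
        x = np.tanh(x @ w)
    return float(x.std())


std_small = final_layer_std(tiny)
std_large = final_layer_std(huge)
std_xavier = final_layer_std(xavier_uniform)

print(f"tiny weights:   std of last layer = {std_small:.2e}")
print(f"huge weights:   std of last layer = {std_large:.2e}")
print(f"xavier weights: std of last layer = {std_xavier:.2e}")
```

In this setup, the small-weight run should collapse toward zero, the large-weight run should push most tanh outputs close to -1 or 1, and the Xavier run should sit between the two. The exact numbers depend on the assumed width, depth, and activation.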
The derivation of the preceding formula is beyond the scope of this book. For a detailed explanation, see (http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization ...