February 2019
Intermediate to advanced
260 pages
6h 3m
English
We are already aware that we have no weight values at the beginning. In order to solve that problem, we will initialize the weights with random non-zero values. This might work well, but here, we are going to look at how initializing weights greatly impacts the learning time of our neural network.
Suppose we have a deep neural network with many hidden layers, and each of these high layers is connected to two neurons. For the sake of simplicity, we'll not take the sigmoid function but the identity activation function, which simply leaves the input untouched. The value is given by F(z), or simply Z:

Assume that we have ...
Read now
Unlock full access