Network initialization
Network initialization is one of the most seemingly trivial, yet crucial, aspects of CNN training. Every CNN layer has parameters, or weights, that are learned from the training set. The most popular algorithm for learning these weights is stochastic gradient descent (SGD). Its inputs include an initial set of weights, a loss function, and labeled training data. SGD uses the initial weights to compute a loss value against the labels in the training data, then adjusts the weights to reduce that loss. The adjusted weights are fed into the next iteration, and the process repeats until convergence is achieved. As this process shows, the choice of initial weights plays a crucial role in the ...
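The SGD loop described above can be sketched in a few lines. This is a minimal illustration, not a CNN: it assumes a toy linear model y = w*x + b with a squared-error loss, and the names `sgd`, `squared_error`, `squared_error_grad`, and `mean_loss`, along with the learning rate and epoch count, are hypothetical choices made for this example. What it shows is the three inputs named in the text (initial weights, a loss function, labeled data) and the repeated adjust-and-feed-forward step.

```python
import random


def squared_error(weights, x, y):
    """Loss for one labeled sample under the toy model y_hat = w*x + b."""
    w, b = weights
    return (w * x + b - y) ** 2


def squared_error_grad(weights, x, y):
    """Gradient of the squared error with respect to [w, b]."""
    w, b = weights
    err = w * x + b - y
    return [2 * err * x, 2 * err]


def mean_loss(weights, data):
    """Average loss over the whole labeled data set."""
    return sum(squared_error(weights, x, y) for x, y in data) / len(data)


def sgd(initial_weights, grad_fn, data, lr=0.05, epochs=200, seed=0):
    """Plain SGD: start from initial_weights, update after every sample."""
    rng = random.Random(seed)
    weights = list(initial_weights)  # copy so the caller's list is untouched
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit samples in a fresh random order
        for x, y in samples:
            grads = grad_fn(weights, x, y)
            # Adjust each weight against its gradient to reduce the loss;
            # the adjusted weights are the input to the next iteration.
            weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Toy labeled training data generated from y = 2x + 1 (an assumption).
data = [(i / 10.0, 2 * (i / 10.0) + 1) for i in range(-10, 11)]

initial = [0.0, 0.0]  # the initial weights handed to SGD
trained = sgd(initial, squared_error_grad, data)

print("loss before:", mean_loss(initial, data))
print("loss after: ", mean_loss(trained, data))
```

Passing a different `initial` list changes the starting point of the search. For this convex toy problem every reasonable start reaches the same weights, which is not guaranteed for a deep network.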