Initializing the Weights

Let me wax nostalgic about perceptrons for a moment. Back in Part I of this book, weight initialization was a quick job: we just set all the weights to 0. By contrast, weight initialization in a neural network comes with a hard-to-spot pitfall. Let’s describe that pitfall, and see how to walk around it.

Fearful Symmetry

Here is one rule to keep in mind: never initialize all the weights in a neural network with the same value. The reason for that recommendation is subtle, and comes from the matrix multiplications in the network. For example, look at this matrix multiplication:

images/training/symmetry.png

You don’t need to remember the details of matrix ...

Get Programming Machine Learning now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.