
Chapter 10
required. The tendency is invariably to overestimate the requirement.
It is not at all unusual to have a problem with hundreds of input
neurons and several output neurons which only requires five or so
hidden neurons. When in doubt, start low and work up as needed. It
Unfortunately, using fewer hidden neurons often increases the
likelihood that the learning algorithm will become trapped in a
local minimum: additional weights open new channels through which
gradient descent can pursue the global minimum. In practice, though,
the tradeoff is rarely worthwhile. Stick with the minimum number
necessary to solve the problem ...
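The start-low-and-work-up advice can be sketched as a simple search loop. The network below is a hypothetical illustration, not the book's own code: a one-hidden-layer sigmoid network trained by full-batch gradient descent on XOR, with the hidden-layer size increased until the training error falls below a tolerance. The function names, learning rate, and tolerance are all assumptions chosen for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mse(n_hidden, X, y, lr=5.0, epochs=10000, seed=0):
    """Train a 1-hidden-layer sigmoid network with full-batch
    gradient descent; return the final mean squared error.
    (Hypothetical helper for illustration.)"""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(size=(n_hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)        # hidden activations
        out = sigmoid(h @ W2 + b2)      # network output
        err = out - y
        # Backpropagate the MSE gradient through both sigmoid layers.
        d_out = err * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / len(X)
        b1 -= lr * d_h.mean(axis=0)
    return float(np.mean(err ** 2))

def smallest_sufficient_hidden(X, y, max_hidden=8, tol=0.05):
    """Start low and work up: return the first hidden-layer size
    whose trained network beats the error tolerance."""
    for n in range(1, max_hidden + 1):
        if train_mse(n, X, y, seed=n) < tol:
            return n
    return None  # nothing up to max_hidden sufficed

# XOR: the classic problem that needs at least two hidden neurons.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
n = smallest_sufficient_hidden(X, y)
print("smallest sufficient hidden size:", n)
```

Because each candidate size is retrained from a fresh random start, an unlucky initialization can make a size look insufficient when it is not; in practice one would average over several seeds before moving up.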