Deep learning models are often treated as black boxes; we pour data in at one end and an answer comes out at the other, without having to care much about how the network learns. While it is true that deep neural nets can be remarkably good at distilling a signal out of complex input data, the flip side of treating these networks as black boxes is that it isn’t always clear what to do when things get stuck.
A common theme among the techniques we discuss here is that we want the network to generalize rather than memorize. It is worth pondering the question of why neural networks generalize at all. Some of the models described in this book and used in production contain millions of parameters, easily enough capacity to memorize a training set of very many examples. If everything goes well, though, the network doesn’t do this, but rather develops generalized rules about its input.
If things don’t go well, you can try the techniques described in this chapter. We’ll start out by looking at how we know that we’re stuck. We’ll then look at various ways in which we can preprocess our input data to make it easier for the network to work with.
How do you know when your network is stuck?
Look at various metrics while the network trains.
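As a concrete illustration, here is a minimal sketch in plain Python (no framework; the one-weight model and the toy data are invented for this example) of what "looking at metrics" means in practice: record training and validation loss at every epoch, so you can see whether the model is learning at all and whether the two curves diverge, which would suggest memorization rather than generalization.

```python
# Toy data for the hypothetical relationship y = 2x; a real project
# would of course use its actual training and validation sets.
train = [(x, 2.0 * x) for x in range(1, 9)]
val = [(x, 2.0 * x) for x in range(9, 12)]

def mse(w, data):
    """Mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w = 0.0            # our entire "network": a single weight
lr = 0.01
history = {"train_loss": [], "val_loss": []}

for epoch in range(50):
    # One full-batch gradient descent step on the training data.
    grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
    w -= lr * grad
    # Record both metrics every epoch -- this is the part that matters.
    history["train_loss"].append(mse(w, train))
    history["val_loss"].append(mse(w, val))

print(history["train_loss"][0], history["train_loss"][-1])
```

Both curves should fall steadily here. In a real run, a flat training curve means the network is not learning, while a training curve that keeps falling as the validation curve rises is the classic sign of memorization; either pattern tells you to reach for the techniques in this chapter.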
The most common signs that things are not well with a neural network are that the network is not learning anything or that it is learning the wrong thing. When we set up the network, ...