The batter hits the ball. The outfielder immediately starts running, anticipating the ball’s trajectory. He tracks it, adapts his movements, and finally catches it (under a thunder of applause). Predicting the future is what you do all the time, whether you are finishing a friend’s sentence or anticipating the smell of coffee at breakfast. In this chapter, we will discuss recurrent neural networks (RNNs), a class of nets that can predict the future (well, up to a point, of course). They can analyze time series data such as stock prices, and tell you when to buy or sell. In autonomous driving systems, they can anticipate car trajectories and help avoid accidents. More generally, they can work on sequences of arbitrary lengths, rather than on fixed-sized inputs like all the nets we have considered so far. For example, they can take sentences, documents, or audio samples as input, making them extremely useful for natural language processing (NLP) systems such as automatic translation or speech-to-text.
In this chapter, we will first look at the fundamental concepts underlying RNNs and how to train them using backpropagation through time, then we will use them to forecast a time series. Next, we will explore the two main difficulties that RNNs face:
Unstable gradients (discussed in Chapter 11), which can be alleviated using various techniques, including recurrent dropout and recurrent layer normalization.
A (very) limited short-term ...