Chapter 7. Recurrent Neural Networks for Natural Language Processing
In Chapter 5, you saw how to tokenize and sequence text, turning sentences into tensors of numbers that could then be fed into a neural network. You extended that in Chapter 6 with embeddings, which give words with similar meanings similar vectors so that they cluster together, making it possible to calculate sentiment. This worked really well, as you saw when you built a sarcasm classifier. But there's a limitation: sentences aren't just collections of words. Often, the order in which the words appear dictates their overall meaning. Also, adjectives can add to or change the meaning of the nouns they appear beside. For example, the word blue might be meaningless from a sentiment perspective, as might sky, but put them together into blue sky and you get a clear sentiment that's usually positive. Finally, some nouns may qualify others, as in rain cloud, writing desk, and coffee mug.
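As a quick refresher, here is a minimal sketch of how those two earlier steps fit together, using the tokenizer and embedding APIs from Chapters 5 and 6. The sentences, vocabulary size, sequence length, and embedding dimension are illustrative assumptions for this example, not the values used with the sarcasm dataset.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Embedding

# Illustrative sentences; Chapters 5 and 6 used the sarcasm dataset instead.
sentences = ['blue sky ahead', 'a dark rain cloud', 'coffee mug on the writing desk']

# Chapter 5: tokenize the text and turn each sentence into a padded sequence of numbers.
tokenizer = Tokenizer(num_words=100, oov_token='<OOV>')
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)
padded = pad_sequences(sequences, maxlen=6, padding='post')

# Chapter 6: map each token ID to a dense vector so words with similar meanings can cluster.
embedding = Embedding(input_dim=100, output_dim=16)
vectors = embedding(padded)
print(vectors.shape)  # (3, 6, 16): sentences, tokens per sentence, embedding dimensions
```

Note that nothing in this pipeline looks at the order of the tokens once the vectors are pooled for classification, which is exactly the limitation described above.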
To take sequences like this into account, you need an additional approach: factoring recurrence into the model architecture. In this chapter, you'll look at different ways of doing that. We'll explore how sequence information can be learned, and how you can use it to create a type of model that's better able to understand text: the recurrent neural network (RNN).
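To make that idea concrete before digging into the details, here is a minimal sketch of what such a model might look like in Keras: instead of pooling the embedded word vectors, a recurrent layer reads them in order, carrying state from one word to the next. The layer choice (a simple RNN), vocabulary size, sequence length, and layer sizes are assumptions for illustration only; the chapter builds up the real architectures step by step.

```python
import tensorflow as tf

# A sketch of factoring recurrence into the architecture. All sizes here are
# illustrative assumptions, not the model built later in the chapter.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,), dtype='int32'),     # padded sequences of 100 token IDs
    tf.keras.layers.Embedding(input_dim=10000, output_dim=16),
    tf.keras.layers.SimpleRNN(32),                   # reads the sequence word by word, carrying state
    tf.keras.layers.Dense(1, activation='sigmoid')   # e.g., sarcastic vs. not sarcastic
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```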
The Basis of Recurrence
To understand how recurrence might work, let’s first consider the limitations ...