How it works...
In this example, we used the IMDb reviews dataset that is built into the keras library. We loaded the training and testing partitions of the data and inspected their structure. Each review is stored as a sequence of integers, where each integer stands for a particular word in a dictionary. The words in this dictionary are ordered by how frequently they occur in the corpus. The dictionary itself is a list of key-value pairs, with the keys representing the words and the values representing the index of each word in the dictionary. To discard the words that are not frequently used, we provided a threshold of 1,000; that ...
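To make the idea of a frequency-ordered dictionary and a vocabulary threshold concrete, here is a minimal sketch in plain Python. It is an illustration only and does not use the keras API: the toy corpus, the helper names `build_word_index` and `encode`, and the `OOV_INDEX` placeholder are all assumptions made for this example, and the real dataset's indexing conventions may differ.

```python
from collections import Counter

# Hypothetical placeholder index for words outside the kept vocabulary.
OOV_INDEX = 0


def build_word_index(texts):
    """Map each word to its frequency rank (1 = most frequent)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def encode(text, word_index, max_words):
    """Encode text as integers, keeping only the max_words most frequent words."""
    encoded = []
    for word in text.lower().split():
        idx = word_index.get(word, OOV_INDEX)
        # Words ranked beyond the threshold are replaced by the placeholder.
        encoded.append(idx if 0 < idx <= max_words else OOV_INDEX)
    return encoded


# Toy corpus standing in for the review texts.
corpus = [
    "the movie was great",
    "the movie was terrible",
    "the plot was thin",
]

word_index = build_word_index(corpus)
encoded_reviews = [encode(text, word_index, max_words=3) for text in corpus]

print(word_index)
print(encoded_reviews)
```

With a threshold of 3, only the three most frequent words in the toy corpus keep their own index, and every rarer word collapses to the placeholder. The threshold of 1,000 in the recipe plays the same role on a much larger vocabulary.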