Chapter 10. NLP Deep Dive: RNNs
In Chapter 1, we saw that deep learning can be used to get great results with natural language datasets. Our example relied on using a pretrained language model and fine-tuning it to classify reviews. That example highlighted a difference between transfer learning in NLP and computer vision: in NLP, the pretrained model is generally trained on a different task than the one we fine-tune it for.
What we call a language model is a model that has been trained to guess the next word in a text (having read the ones before). This kind of task is called self-supervised learning: we do not need to give labels to our model, just feed it lots and lots of texts. The labels are derived automatically from the data itself, and the task isn’t trivial: to properly guess the next word in a sentence, the model has to develop an understanding of the English (or other) language. Self-supervised learning can also be used in other domains; for instance, see “Self-Supervised Learning and Computer Vision” for an introduction to vision applications. Self-supervised learning is not usually used for the model that is trained directly, but instead for pretraining a model that is then used for transfer learning.
Jargon: Self-Supervised Learning
Training a model using labels that are embedded in the independent variable, rather than requiring external labels. For instance, training a model to predict the next word in a text.
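To make this concrete, here is a minimal sketch in plain Python (not the fastai API we use later in this chapter) of how the labels for next-word prediction come straight from the text itself: the dependent variable for each token is simply the token that follows it. The sentence below is just a made-up example.

    # A minimal sketch of self-supervised labeling for next-word prediction.
    # The "labels" are just the text shifted by one token; nothing external
    # is needed. (Illustrative only; fastai's language-model data loading
    # does this for real datasets.)
    text = "In the beginning the model knows nothing about language"
    tokens = text.split()

    # Independent variable: each token; dependent variable: the token after it.
    pairs = [(tokens[i], tokens[i + 1]) for i in range(len(tokens) - 1)]

    for x, y in pairs[:3]:
        print(f"input: {x!r} -> label: {y!r}")
    # input: 'In' -> label: 'the'
    # input: 'the' -> label: 'beginning'
    # input: 'beginning' -> label: 'the'

A real language model works on numericalized tokens and long streams of text rather than single words, but the principle is the same: the data provides its own labels.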
The language model we used in Chapter 1 to classify IMDb reviews was pretrained ...