appendix C Natural language processing
C.1 Touring around the zoo: Meeting other Transformer models
In chapter 13, we discussed a powerful Transformer-based model known as BERT (bidirectional encoder representations from Transformers). But BERT was just the beginning of a wave of Transformer models. Its successors grew more capable, either by addressing theoretical shortcomings of BERT or by re-engineering various aspects of the model for better speed and accuracy. Let’s look at some of these popular models to learn what sets them apart from BERT.
C.1.1 Generative pre-training (GPT) model (2018)
The story actually starts even before BERT. OpenAI introduced a model called GPT in the paper “Improving Language Understanding by Generative Pre-Training” ...