appendix C Natural language processing

C.1 Touring around the zoo: Meeting other Transformer models

In chapter 13, we discussed a powerful Transformer-based model known as BERT (bidirectional encoder representations from Transformers). But BERT was just the beginning of a wave of Transformer models. Successive models grew more capable, either by addressing theoretical shortcomings of BERT or by re-engineering various aspects of the model to run faster and perform better. Let's look at some of these popular models to understand what sets them apart from BERT.

C.1.1 Generative pre-training (GPT) model (2018)

The story actually starts even before BERT. OpenAI introduced a model called GPT in the paper “Improving Language Understanding by Generative Pre-Training” ...
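To give a concrete sense of what "generative pre-training" refers to, the sketch below shows the next-token-prediction objective that GPT-style models are trained with: the model reads each prefix of a token sequence and is scored on how well it predicts the token that follows. This is only a minimal illustration; the tiny Keras model here is a hypothetical stand-in for GPT's Transformer decoder, not the actual architecture or training setup.

```python
import tensorflow as tf

vocab_size = 100  # toy vocabulary size for illustration

# A toy token sequence; in practice these come from a tokenized corpus.
tokens = tf.constant([[5, 17, 42, 8, 23]])

inputs = tokens[:, :-1]   # the model sees tokens 0..n-1 ...
targets = tokens[:, 1:]   # ... and must predict tokens 1..n

# Stand-in for a Transformer decoder: embedding + vocabulary projection.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 32),
    tf.keras.layers.Dense(vocab_size),
])

# Per-position logits over the vocabulary, shape [batch, steps, vocab].
logits = model(inputs)

# Cross-entropy between the predicted distribution at each position
# and the actual next token: the generative pre-training loss.
loss = tf.keras.losses.sparse_categorical_crossentropy(
    targets, logits, from_logits=True)
print(tf.reduce_mean(loss))
```

In a real GPT model, the stand-in layers would be replaced by a stack of masked self-attention blocks so that each position can only attend to earlier tokens, but the training signal is exactly this next-token loss.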
