10

Transformers

Transformer models have changed the game for most machine learning problems involving sequential data, advancing the state of the art by a significant margin over the previous leaders, RNN-based models. One of the primary reasons the Transformer model is so performant is that it has access to the whole sequence of items (e.g. a sequence of tokens) at once, whereas RNN-based models process one item at a time. The term Transformer has come up several times so far as a method that has outperformed other sequential models such as LSTMs and GRUs. Now, we will learn about Transformer models in more detail.
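To make the contrast concrete, the mechanism that gives the Transformer access to the whole sequence at once is self-attention: every position computes a weighted combination of all positions. The following is a minimal NumPy sketch of scaled dot-product self-attention (the function name, toy dimensions, and random weight matrices are illustrative assumptions, not code from this book):

```python
import numpy as np

def scaled_dot_product_attention(x, w_q, w_k, w_v):
    """Every position attends to every position in the sequence at once."""
    q = x @ w_q  # queries, shape (seq_len, d)
    k = x @ w_k  # keys,    shape (seq_len, d)
    v = x @ w_v  # values,  shape (seq_len, d)
    # Pairwise similarity between all positions: (seq_len, seq_len)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over each row so the weights for a position sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row mixes information from the entire sequence
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d = 4, 8
x = rng.normal(size=(seq_len, d))  # toy sequence of 4 token embeddings
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out = scaled_dot_product_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8): one output vector per position
```

Note that the similarity scores for all pairs of positions are computed in a single matrix product, with no sequential recurrence; an RNN, by contrast, must step through the four positions one after another.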

In this chapter, we will first learn about the Transformer model in detail. Then we will discuss ...
