13 Transformers

This chapter covers

  • Implementing a full Transformer model with all the components
  • Implementing a spam classifier using a pretrained BERT model from TFHub
  • Implementing a question-answering model using Hugging Face’s Transformers library

In chapters 11 and 12, you learned about sequence-to-sequence models, a powerful family of models that maps an arbitrary-length sequence to another arbitrary-length sequence. We exemplified this ability through a machine translation task. Sequence-to-sequence models consist of an encoder and a decoder. The encoder takes in the input sequence (a sentence in the source language) and creates a compact representation of it (known as the context vector). The decoder takes in the context vector and generates the target sequence (e.g., the translation in the target language).
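To make the encoder-decoder idea concrete before we move to Transformers, here is a minimal Keras sketch of a GRU-based sequence-to-sequence model in which the encoder's final hidden state serves as the context vector. The vocabulary sizes, embedding dimension, and number of GRU units are illustrative assumptions, not values from chapters 11 and 12.

```python
import tensorflow as tf

# Illustrative sizes (assumptions, not from the book)
src_vocab, tgt_vocab, embed_dim, units = 5000, 5000, 128, 256

# Encoder: consumes the source sequence and compresses it into a
# single context vector (the GRU's final hidden state)
enc_inputs = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(src_vocab, embed_dim)(enc_inputs)
_, context_vector = tf.keras.layers.GRU(units, return_state=True)(x)

# Decoder: initialized with the context vector, generates the target
# sequence token by token (teacher forcing during training)
dec_inputs = tf.keras.Input(shape=(None,), dtype="int32")
y = tf.keras.layers.Embedding(tgt_vocab, embed_dim)(dec_inputs)
y = tf.keras.layers.GRU(units, return_sequences=True)(
    y, initial_state=context_vector
)
dec_outputs = tf.keras.layers.Dense(tgt_vocab, activation="softmax")(y)

model = tf.keras.Model([enc_inputs, dec_inputs], dec_outputs)
model.summary()
```

Note how the entire source sentence must be squeezed through the single fixed-size `context_vector`; this bottleneck is one of the motivations for the attention mechanism at the heart of the Transformer.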
