Mastering Transformers

Take a problem-solving approach to learning all about transformers and get up and running in no time by implementing methodologies that will build the future of NLP

Key Features

  • Explore quick prototyping with up-to-date Python libraries to create effective solutions to industrial problems
  • Solve advanced NLP problems such as named-entity recognition, information extraction, language generation, and conversational AI
  • Monitor your model's performance with the help of BertViz, exBERT, and TensorBoard

Book Description

Transformer-based language models have dominated natural language processing (NLP) studies and have now become a new paradigm. With this book, you'll learn how to build various transformer-based NLP applications using the Hugging Face Transformers library in Python.

The book gives you an introduction to Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer.

As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation.

This book also helps you learn efficient models for challenging problems, such as long-context NLP tasks with limited computational capacity. You'll also work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and discover how to deconstruct these models for interpretability and explainability. Finally, you'll learn how to deploy your transformer models in a production environment.
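
To give a flavor of the hello-world style the book opens with, here is a minimal sketch using the Hugging Face Transformers pipeline API. The chosen task, the default model download, and the example sentence are illustrative assumptions rather than code taken from the book.

    # A minimal "hello world" with the Hugging Face Transformers library.
    # pipeline() downloads a default pretrained model for the chosen task;
    # the example sentence below is illustrative only.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    result = classifier("Transformers make modern NLP remarkably accessible.")
    print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]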

By the end of this NLP book, you'll have learned how to use the Transformers library and its state-of-the-art models to solve advanced NLP problems.

What you will learn

  • Explore state-of-the-art NLP solutions with the Transformers library
  • Train a language model in any language with any transformer architecture
  • Fine-tune a pre-trained language model to perform several downstream tasks (a minimal sketch follows this list)
  • Select the right framework for the training, evaluation, and production of an end-to-end solution
  • Get hands-on experience in using TensorBoard and Weights & Biases
  • Visualize the internal representation of transformer models for interpretability
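
As a preview of the fine-tuning workflow listed above, the following sketch uses the Hugging Face Trainer API. The model checkpoint (distilbert-base-uncased), the dataset (imdb), and the training settings are illustrative assumptions, not the book's own examples.

    # A minimal fine-tuning sketch with the Hugging Face Trainer API.
    # Model, dataset, and hyperparameters are illustrative assumptions.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    dataset = load_dataset("imdb")
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

    def tokenize(batch):
        # Truncate/pad reviews to a fixed length for batching
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    dataset = dataset.map(tokenize, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)
    args = TrainingArguments(output_dir="out", num_train_epochs=1,
                             per_device_train_batch_size=8)

    trainer = Trainer(
        model=model, args=args,
        train_dataset=dataset["train"].shuffle(seed=42).select(range(1000)),
        eval_dataset=dataset["test"].select(range(500)))
    trainer.train()

Small subsets of the data are selected here only to keep the sketch quick to run; in practice you would train on the full splits.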

Who this book is for

This book is for deep learning researchers, hands-on NLP practitioners, and ML/NLP educators and students who want to start their journey with Transformers. Beginner-level machine learning knowledge and a good command of Python will help you get the best out of this book.

Table of contents

  1. Mastering Transformers
  2. Contributors
  3. About the authors
  4. About the reviewer
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Code in Action
    6. Download the color images
    7. Conventions used
    8. Get in touch
    9. Share Your Thoughts
  6. Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications
  7. Chapter 1: From Bag-of-Words to the Transformer
    1. Technical requirements
    2. Evolution of NLP toward Transformers
    3. Understanding distributional semantics
      1. BoW implementation
      2. Overcoming the dimensionality problem
      3. Language modeling and generation
    4. Leveraging DL
      1. Learning word embeddings
      2. A brief overview of RNNs
      3. LSTMs and gated recurrent units
      4. A brief overview of CNNs
    5. Overview of the Transformer architecture
      1. Attention mechanism
      2. Multi-head attention mechanisms
    6. Using TL with Transformers
    7. Summary
    8. References
  8. Chapter 2: A Hands-On Introduction to the Subject
    1. Technical requirements
    2. Installing Transformers with Anaconda
      1. Installation on Linux
      2. Installation on Windows
      3. Installation on macOS
      4. Installing TensorFlow, PyTorch, and Transformers
      5. Installing using Google Colab
    3. Working with language models and tokenizers
    4. Working with community-provided models
    5. Working with benchmarks and datasets
      1. Important benchmarks
      2. Accessing the datasets with an Application Programming Interface
    6. Benchmarking for speed and memory
    7. Summary
  9. Section 2: Transformer Models – From Autoencoding to Autoregressive Models
  10. Chapter 3: Autoencoding Language Models
    1. Technical requirements
    2. BERT – one of the autoencoding language models
      1. BERT language model pretraining tasks
      2. A deeper look into the BERT language model
    3. Autoencoding language model training for any language
    4. Sharing models with the community
    5. Understanding other autoencoding models
      1. Introducing ALBERT
      2. RoBERTa
      3. ELECTRA
    6. Working with tokenization algorithms
      1. Byte pair encoding
      2. WordPiece tokenization
      3. SentencePiece tokenization
      4. The tokenizers library
    7. Summary
  11. Chapter 4: Autoregressive and Other Language Models
    1. Technical requirements
    2. Working with AR language models
      1. Introduction and training models with GPT
      2. Transformer-XL
      3. XLNet
    3. Working with Seq2Seq models
      1. T5
      2. Introducing BART
    4. AR language model training
    5. NLG using AR models
    6. Summarization and MT fine-tuning using simpletransformers
    7. Summary
    8. References
  12. Chapter 5: Fine-Tuning Language Models for Text Classification
    1. Technical requirements
    2. Introduction to text classification
    3. Fine-tuning a BERT model for single-sentence binary classification
    4. Training a classification model with native PyTorch
    5. Fine-tuning BERT for multi-class classification with custom datasets
    6. Fine-tuning the BERT model for sentence-pair regression
    7. Utilizing run_glue.py to fine-tune the models
    8. Summary
  13. Chapter 6: Fine-Tuning Language Models for Token Classification
    1. Technical requirements
    2. Introduction to token classification
      1. Understanding NER
      2. Understanding POS tagging
      3. Understanding QA
    3. Fine-tuning language models for NER
    4. Question answering using token classification
    5. Summary
  14. Chapter 7: Text Representation
    1. Technical requirements
    2. Introduction to sentence embeddings
      1. Cross-encoder versus bi-encoder
      2. Benchmarking sentence similarity models
      3. Using BART for zero-shot learning
    3. Semantic similarity experiment with FLAIR
      1. Average word embeddings
      2. RNN-based document embeddings
      3. Transformer-based BERT embeddings
      4. Sentence-BERT embeddings
    4. Text clustering with Sentence-BERT
      1. Topic modeling with BERTopic
    5. Semantic search with Sentence-BERT
    6. Summary
    7. Further reading
  15. Section 3: Advanced Topics
  16. Chapter 8: Working with Efficient Transformers
    1. Technical requirements
    2. Introduction to efficient, light, and fast transformers
    3. Implementation for model size reduction
      1. Working with DistilBERT for knowledge distillation
      2. Pruning transformers
      3. Quantization
    4. Working with efficient self-attention
      1. Sparse attention with fixed patterns
      2. Learnable patterns
      3. Low-rank factorization, kernel methods, and other approaches
    5. Summary
    6. References
  17. Chapter 9: Cross-Lingual and Multilingual Language Modeling
    1. Technical requirements
    2. Translation language modeling and cross-lingual knowledge sharing
    3. XLM and mBERT
      1. mBERT
      2. XLM
    4. Cross-lingual similarity tasks
      1. Cross-lingual text similarity
      2. Visualizing cross-lingual textual similarity
    5. Cross-lingual classification
    6. Cross-lingual zero-shot learning
    7. Fundamental limitations of multilingual models
      1. Fine-tuning the performance of multilingual models
    8. Summary
    9. References
  18. Chapter 10: Serving Transformer Models
    1. Technical requirements
    2. FastAPI Transformer model serving
    3. Dockerizing APIs
    4. Faster Transformer model serving using TFX
    5. Load testing using Locust
    6. Summary
    7. References
  19. Chapter 11: Attention Visualization and Experiment Tracking
    1. Technical requirements
    2. Interpreting attention heads
      1. Visualizing attention heads with exBERT
      2. Multiscale visualization of attention heads with BertViz
      3. Understanding the inner parts of BERT with probing classifiers
    3. Tracking model metrics
      1. Tracking model training with TensorBoard
      2. Tracking model training live with W&B
    4. Summary
    5. References
    6. Why subscribe?
  20. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think
    2. Share Your Thoughts

Product information

  • Title: Mastering Transformers
  • Author(s): Savaş Yıldırım, Meysam Asgari-Chenaghlu
  • Release date: September 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781801077651