Transfer Learning for Natural Language Processing

Book description

Build custom NLP models in record time by adapting pretrained machine learning models to solve specialized problems.

In Transfer Learning for Natural Language Processing you will learn:
  • Fine-tuning pretrained models with new domain data
  • Picking the right model to reduce resource usage
  • Transfer learning for neural network architectures
  • Generating text with generative pretrained transformers
  • Cross-lingual transfer learning with BERT
  • Foundations for exploring NLP academic literature

Training deep learning NLP models from scratch is costly, time-consuming, and requires massive amounts of data. In Transfer Learning for Natural Language Processing, DARPA researcher Paul Azunre reveals cutting-edge transfer learning techniques that apply customizable pretrained models to your own NLP architectures. You’ll learn how to use transfer learning to deliver state-of-the-art results for language comprehension, even when working with limited labeled data. Best of all, you’ll save on training time and computational costs.
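
To make that workflow concrete, here is a minimal, illustrative sketch of this kind of fine-tuning using the Hugging Face transformers library, which the book's hands-on examples build on. It is not code from the book: the checkpoint name, toy review data, and training settings are placeholder assumptions chosen only for demonstration.

    # Minimal illustration (not the book's code): fine-tune a pretrained BERT
    # checkpoint on a tiny labeled dataset. Names and settings are placeholders.
    import torch
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                               num_labels=2)

    texts = ["a thrilling, beautifully acted film", "dull plot and wooden dialogue"]
    labels = [1, 0]  # 1 = positive review, 0 = negative review (toy data)
    encodings = tokenizer(texts, truncation=True, padding=True)

    class ToyDataset(torch.utils.data.Dataset):
        """Wraps the tokenized toy examples for the Trainer."""
        def __init__(self, encodings, labels):
            self.encodings, self.labels = encodings, labels
        def __getitem__(self, idx):
            item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
            item["labels"] = torch.tensor(self.labels[idx])
            return item
        def __len__(self):
            return len(self.labels)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=ToyDataset(encodings, labels),
    )
    trainer.train()  # updates the pretrained weights on the new domain data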

About the Technology
Build custom NLP models in record time, even with limited datasets! Transfer learning is a machine learning technique for adapting pretrained machine learning models to solve specialized problems. This powerful approach has revolutionized natural language processing, driving improvements in machine translation, business analytics, and natural language generation.
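
As a small, hedged illustration of reusing a pretrained model for natural language generation, the sketch below loads a pretrained GPT-2 checkpoint through the transformers pipeline API and generates text with no task-specific training at all; the prompt and generation settings are arbitrary choices, not examples from the book.

    # Illustrative only: reuse a pretrained generative model as-is, no retraining.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")  # pretrained GPT-2 checkpoint
    result = generator("Transfer learning lets us", max_length=30, num_return_sequences=1)
    print(result[0]["generated_text"])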

About the Book
Transfer Learning for Natural Language Processing teaches you to create powerful NLP solutions quickly by building on existing pretrained models. This instantly useful book provides crystal-clear explanations of the concepts you need to grok transfer learning along with hands-on examples so you can practice your new skills immediately. As you go, you’ll apply state-of-the-art transfer learning methods to create a spam email classifier, a fact checker, and more real-world applications.
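
For a flavor of the kind of baseline the early chapters build toward, here is a hedged sketch, not the book's code, of a spam classifier that transfers pretrained GloVe word embeddings into a simple scikit-learn model; the email snippets and embedding choice are toy placeholders for demonstration.

    # Sketch: represent each email with averaged pretrained GloVe vectors, then
    # train a plain logistic-regression classifier on top. Toy data only.
    import numpy as np
    import gensim.downloader as api
    from sklearn.linear_model import LogisticRegression

    vectors = api.load("glove-wiki-gigaword-50")  # pretrained 50-d word embeddings

    def embed(text):
        # Average the pretrained vectors of the words the vocabulary recognizes.
        words = [w for w in text.lower().split() if w in vectors]
        return np.mean([vectors[w] for w in words], axis=0) if words else np.zeros(50)

    emails = ["win a free prize now", "lunch moved to noon",
              "claim your reward today", "quarterly report attached"]
    labels = [1, 0, 1, 0]  # 1 = spam, 0 = legitimate

    clf = LogisticRegression().fit([embed(e) for e in emails], labels)
    print(clf.predict([embed("your free reward is waiting")]))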

What's Inside
  • Fine-tuning pretrained models with new domain data
  • Picking the right model to reduce resource use
  • Transfer learning for neural network architectures
  • Generating text with pretrained transformers


About the Reader
For machine learning engineers and data scientists with some experience in NLP.

About the Author
Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA research programs.

Quotes
For anyone looking to dive deep into recent breakthroughs in NLP & transfer learning, this book is for you!
- Matthew Sarmiento, Plume Design

Does an excellent job of introducing the techniques and concepts to NLP practitioners.
- Marc-Anthony Taylor, Blackshark.ai

Keep this book handy if you want to be good at applied and real-world NLP.
- Sayak Paul, PyImageSearch

Good fundamentals of transfer learning for NLP applications. Sets you up for success!
- Vamsi Sistla, Data Science & ML Consultant

Table of contents

  1. Transfer Learning for Natural Language Processing
  2. Copyright
  3. dedication
  4. contents
  5. front matter
    1. preface
    2. acknowledgments
    3. about this book
    4. Who should read this book?
    5. Road map
    6. Software requirements
    7. About the code
    8. liveBook discussion forum
    9. about the author
    10. about the cover illustration
  6. Part 1 Introduction and overview
  7. 1 What is transfer learning?
    1. 1.1 Overview of representative NLP tasks
    2. 1.2 Understanding NLP in the context of AI
      1. 1.2.1 Artificial intelligence (AI)
      2. 1.2.2 Machine learning
      3. 1.2.3 Natural language processing (NLP)
    3. 1.3 A brief history of NLP advances
      1. 1.3.1 General overview
      2. 1.3.2 Recent transfer learning advances
    4. 1.4 Transfer learning in computer vision
      1. 1.4.1 General overview
      2. 1.4.2 Pretrained ImageNet models
      3. 1.4.3 Fine-tuning pretrained ImageNet models
    5. 1.5 Why is NLP transfer learning an exciting topic to study now?
    6. Summary
  8. 2 Getting started with baselines: Data preprocessing
    1. 2.1 Preprocessing email spam classification example data
      1. 2.1.1 Loading and visualizing the Enron corpus
      2. 2.1.2 Loading and visualizing the fraudulent email corpus
      3. 2.1.3 Converting the email text into numbers
    2. 2.2 Preprocessing movie sentiment classification example data
    3. 2.3 Generalized linear models
      1. 2.3.1 Logistic regression
      2. 2.3.2 Support vector machines (SVMs)
    4. Summary
  9. 3 Getting started with baselines: Benchmarking and optimization
    1. 3.1 Decision-tree-based models
      1. 3.1.1 Random forests (RFs)
      2. 3.1.2 Gradient-boosting machines (GBMs)
    2. 3.2 Neural network models
      1. 3.2.1 Embeddings from Language Models (ELMo)
      2. 3.2.2 Bidirectional Encoder Representations from Transformers (BERT)
    3. 3.3 Optimizing performance
      1. 3.3.1 Manual hyperparameter tuning
      2. 3.3.2 Systematic hyperparameter tuning
    4. Summary
  10. Part 2 Shallow transfer learning and deep transfer learning with recurrent neural networks (RNNs)
  11. 4 Shallow transfer learning for NLP
    1. 4.1 Semisupervised learning with pretrained word embeddings
    2. 4.2 Semisupervised learning with higher-level representations
    3. 4.3 Multitask learning
      1. 4.3.1 Problem setup and a shallow neural single-task baseline
      2. 4.3.2 Dual-task experiment
    4. 4.4 Domain adaptation
    5. Summary
  12. 5 Preprocessing data for recurrent neural network deep transfer learning experiments
    1. 5.1 Preprocessing tabular column-type classification data
      1. 5.1.1 Obtaining and visualizing tabular data
      2. 5.1.2 Preprocessing tabular data
      3. 5.1.3 Encoding preprocessed data as numbers
    2. 5.2 Preprocessing fact-checking example data
      1. 5.2.1 Special problem considerations
      2. 5.2.2 Loading and visualizing fact-checking data
    3. Summary
  13. 6 Deep transfer learning for NLP with recurrent neural networks
    1. 6.1 Semantic Inference for the Modeling of Ontologies (SIMOn)
      1. 6.1.1 General neural architecture overview
      2. 6.1.2 Modeling tabular data
      3. 6.1.3 Application of SIMOn to tabular column-type classification data
    2. 6.2 Embeddings from Language Models (ELMo)
      1. 6.2.1 ELMo bidirectional language modeling
      2. 6.2.2 Application to fake news detection
    3. 6.3 Universal Language Model Fine-Tuning (ULMFiT)
      1. 6.3.1 Target task language model fine-tuning
      2. 6.3.2 Target task classifier fine-tuning
    4. Summary
  14. Part 3 Deep transfer learning with transformers and adaptation strategies
  15. 7 Deep transfer learning for NLP with the transformer and GPT
    1. 7.1 The transformer
      1. 7.1.1 An introduction to the transformers library and attention visualization
      2. 7.1.2 Self-attention
      3. 7.1.3 Residual connections, encoder-decoder attention, and positional encoding
      4. 7.1.4 Application of pretrained encoder-decoder to translation
    2. 7.2 The Generative Pretrained Transformer
      1. 7.2.1 Architecture overview
      2. 7.2.2 Transformers pipelines introduction and application to text generation
      3. 7.2.3 Application to chatbots
    3. Summary
  16. 8 Deep transfer learning for NLP with BERT and multilingual BERT
    1. 8.1 Bidirectional Encoder Representations from Transformers (BERT)
      1. 8.1.1 Model architecture
      2. 8.1.2 Application to question answering
      3. 8.1.3 Application to fill in the blanks and next-sentence prediction tasks
    2. 8.2 Cross-lingual learning with multilingual BERT (mBERT)
      1. 8.2.1 Brief JW300 dataset overview
      2. 8.2.2 Transfer mBERT to monolingual Twi data with the pretrained tokenizer
      3. 8.2.3 mBERT and tokenizer trained from scratch on monolingual Twi data
    3. Summary
  17. 9 ULMFiT and knowledge distillation adaptation strategies
    1. 9.1 Gradual unfreezing and discriminative fine-tuning
      1. 9.1.1 Pretrained language model fine-tuning
      2. 9.1.2 Target task classifier fine-tuning
    2. 9.2 Knowledge distillation
      1. 9.2.1 Transfer DistilmBERT to monolingual Twi data with pretrained tokenizer
    3. Summary
  18. 10 ALBERT, adapters, and multitask adaptation strategies
    1. 10.1 Embedding factorization and cross-layer parameter sharing
      1. 10.1.1 Fine-tuning pretrained ALBERT on MDSD book reviews
    2. 10.2 Multitask fine-tuning
      1. 10.2.1 General Language Understanding Evaluation (GLUE)
      2. 10.2.2 Fine-tuning on a single GLUE task
      3. 10.2.3 Sequential adaptation
    3. 10.3 Adapters
    4. Summary
  19. 11 Conclusions
    1. 11.1 Overview of key concepts
    2. 11.2 Other emerging research trends
      1. 11.2.1 RoBERTa
      2. 11.2.2 GPT-3
      3. 11.2.3 XLNet
      4. 11.2.4 BigBird
      5. 11.2.5 Longformer
      6. 11.2.6 Reformer
      7. 11.2.7 T5
      8. 11.2.8 BART
      9. 11.2.9 XLM
      10. 11.2.10 TAPAS
    3. 11.3 Future of transfer learning in NLP
    4. 11.4 Ethical and environmental considerations
    5. 11.5 Staying up-to-date
      1. 11.5.1 Kaggle and Zindi competitions
      2. 11.5.2 arXiv
      3. 11.5.3 News and social media (Twitter)
    6. 11.6 Final words
    7. Summary
  20. appendix A Kaggle primer
    1. A.1 Free GPUs with Kaggle kernels
    2. A.2 Competitions, discussion, and blog
  21. appendix B Introduction to fundamental deep learning tools
    1. B.1 Stochastic gradient descent
    2. B.2 TensorFlow
    3. B.3 PyTorch
    4. B.4 Keras, fast.ai, and Transformers by Hugging Face
  22. index

Product information

  • Title: Transfer Learning for Natural Language Processing
  • Author(s): Paul Azunre
  • Release date: August 2021
  • Publisher(s): Manning Publications
  • ISBN: 9781617297267