Advanced Natural Language Processing with TensorFlow 2

Book description

One-stop solution for NLP practitioners, ML developers, and data scientists to build effective NLP systems that can perform complicated real-world tasks

Key Features

  • Apply deep learning algorithms and techniques such as BiLSTMs, CRFs, BPE, and more using TensorFlow 2
  • Explore applications such as text generation, summarization, weakly supervised labeling, and more
  • Study cutting-edge material, with seminal papers referenced and full working code provided in the GitHub repository

Book Description

There have been tremendous advances in NLP recently, and these techniques are now moving from research labs into practical applications. This book offers a blend of the theoretical and practical aspects of trending and complex NLP techniques.

The book focuses on innovative applications in the field of NLP, language generation, and dialogue systems. It shows you how to pre-process text with techniques such as tokenization, part-of-speech tagging, and lemmatization, using popular libraries such as Stanford NLP and spaCy. You will build a Named Entity Recognition (NER) system from scratch, using Conditional Random Fields and Viterbi decoding on top of RNNs.
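
As a quick taste of this kind of pre-processing, here is a minimal sketch using spaCy; it assumes the small English model (en_core_web_sm) has been downloaded, and the example sentence is illustrative only, not taken from the book:

    import spacy

    # Load spaCy's small English pipeline (install it first with:
    # python -m spacy download en_core_web_sm).
    nlp = spacy.load("en_core_web_sm")

    doc = nlp("The striped bats were hanging on their feet.")
    for token in doc:
        # Print each token alongside its part-of-speech tag and lemma.
        print(token.text, token.pos_, token.lemma_)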

The book covers key emerging areas such as generating text for use in sentence completion and text summarization, bridging images and text by generating captions for images, and managing dialogue aspects of chatbots. You will learn how to apply transfer learning and fine-tuning using TensorFlow 2.
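
Transfer learning in TensorFlow 2 typically alternates between feature extraction (training a new head on a frozen encoder) and fine-tuning (unfreezing the encoder with a small learning rate). The sketch below shows that pattern with tf.keras; the saved-model name "pretrained_encoder" and the head layers are hypothetical placeholders, not the book's code:

    import tensorflow as tf

    # Hypothetical saved encoder standing in for a pretrained model
    # (in the book, GloVe embeddings or BERT play this role).
    encoder = tf.keras.models.load_model("pretrained_encoder")

    # Feature extraction: freeze the encoder and train only the new head.
    encoder.trainable = False
    model = tf.keras.Sequential([
        encoder,
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="binary_crossentropy", metrics=["accuracy"])

    # Fine-tuning: unfreeze the encoder and recompile with a much lower
    # learning rate so the pretrained weights shift only slightly.
    encoder.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="binary_crossentropy", metrics=["accuracy"])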

Further, it covers practical techniques that can simplify the labeling of textual data. Each technique comes with working code that you can adapt to your own use cases.
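
In Snorkel, for instance, such techniques take the form of labeling functions: small heuristics that vote for a label or abstain. Below is a minimal sketch assuming data points that expose a text attribute; the keyword rule and names are hypothetical illustrations, not the book's code:

    from snorkel.labeling import labeling_function

    # Label values follow Snorkel's convention: -1 means abstain.
    ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

    @labeling_function()
    def lf_contains_great(x):
        # Vote POSITIVE when an obvious sentiment keyword appears;
        # otherwise abstain and let other labeling functions decide.
        return POSITIVE if "great" in x.text.lower() else ABSTAIN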

By the end of the book, you will have advanced knowledge of the tools, techniques, and deep learning architectures used to solve complex NLP problems.

What you will learn

  • Grasp important pre-processing steps, such as POS tagging, for building NLP applications
  • Apply transfer learning and weakly supervised learning using libraries such as Snorkel
  • Perform sentiment analysis using BERT
  • Apply encoder-decoder neural network architectures and beam search for summarizing text
  • Use Transformer models with attention to bring images and text together
  • Build apps that generate captions and answer questions about images using custom Transformers
  • Use advanced TensorFlow techniques like learning rate annealing, custom layers, and custom loss functions to build the latest DeepNLP models (a small annealing sketch follows this list)
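
To give a flavor of those advanced techniques, here is a minimal sketch of learning rate annealing implemented as a custom tf.keras schedule; the decay constants are illustrative defaults, not values from the book:

    import tensorflow as tf

    class ExponentialAnnealing(
            tf.keras.optimizers.schedules.LearningRateSchedule):
        """Smoothly anneals the learning rate as training progresses."""

        def __init__(self, initial_lr=1e-3, decay_rate=0.96, decay_steps=1000):
            self.initial_lr = initial_lr
            self.decay_rate = decay_rate
            self.decay_steps = decay_steps

        def __call__(self, step):
            # lr = initial_lr * decay_rate ** (step / decay_steps)
            exponent = tf.cast(step, tf.float32) / self.decay_steps
            return self.initial_lr * tf.pow(self.decay_rate, exponent)

    # Pass the schedule directly to an optimizer.
    optimizer = tf.keras.optimizers.Adam(learning_rate=ExponentialAnnealing())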

Who this book is for

This is not an introductory book; it assumes that the reader is familiar with the basics of NLP, has fundamental Python skills, and has a basic knowledge of machine learning and undergraduate-level calculus and linear algebra.

The readers who can benefit most from this book include intermediate ML developers familiar with the basics of supervised learning and deep learning techniques, and professionals who already use TensorFlow/Python for purposes such as data science, ML, research, and analysis.

Table of contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  2. Essentials of NLP
    1. A typical text processing workflow
    2. Data collection and labeling
      1. Collecting labeled data
        1. Development environment setup
      2. Enabling GPUs on Google Colab
    3. Text normalization
      1. Modeling normalized data
      2. Tokenization
        1. Segmentation in Japanese
        2. Modeling tokenized data
      3. Stop word removal
        1. Modeling data with stop words removed
      4. Part-of-speech tagging
        1. Modeling data with POS tagging
      5. Stemming and lemmatization
    4. Vectorizing text
      1. Count-based vectorization
        1. Modeling after count-based vectorization
      2. Term Frequency-Inverse Document Frequency (TF-IDF)
        1. Modeling using TF-IDF features
      3. Word vectors
        1. Pretrained models using Word2Vec embeddings
    5. Summary
  3. Understanding Sentiment in Natural Language with BiLSTMs
    1. Natural language understanding
    2. Bi-directional LSTMs – BiLSTMs
      1. RNN building blocks
      2. Long short-term memory (LSTM) networks
      3. Gated recurrent units (GRUs)
      4. Sentiment classification with LSTMs
        1. Loading the data
        2. Normalization and vectorization
        3. LSTM model with embeddings
        4. BiLSTM model
    3. Summary
  4. Named Entity Recognition (NER) with BiLSTMs, CRFs, and Viterbi Decoding
    1. Named Entity Recognition
      1. The GMB data set
    2. Loading the data
    3. Normalizing and vectorizing data
    4. A BiLSTM model
    5. Conditional random fields (CRFs)
    6. NER with BiLSTM and CRFs
      1. Implementing the custom CRF layer, loss, and model
        1. A custom CRF model
        2. A custom loss function for NER using a CRF
      2. Implementing custom training
    7. Viterbi decoding
      1. The probability of the first word label
    8. Summary
  5. Transfer Learning with BERT
    1. Transfer learning overview
      1. Types of transfer learning
        1. Domain adaptation
        2. Multi-task learning
        3. Sequential learning
    2. IMDb sentiment analysis with GloVe embeddings
      1. GloVe embeddings
      2. Loading IMDb training data
      3. Loading pre-trained GloVe embeddings
      4. Creating a pre-trained embedding matrix using GloVe
      5. Feature extraction model
      6. Fine-tuning model
    3. BERT-based transfer learning
      1. Encoder-decoder networks
      2. Attention model
      3. Transformer model
      4. The Bidirectional Encoder Representations from Transformers (BERT) model
        1. Tokenization and normalization with BERT
        2. Pre-built BERT classification model
        3. Custom model with BERT
    4. Summary
  6. Generating Text with RNNs and GPT-2
    1. Generating text – one character at a time
      1. Data loading and pre-processing
      2. Data normalization and tokenization
      3. Training the model
      4. Implementing learning rate decay as custom callback
      5. Generating text with greedy search
    2. Generative Pre-Training (GPT-2) model
      1. Generating text with GPT-2
    3. Summary
  7. Text Summarization with Seq2seq Attention and Transformer Networks
    1. Overview of text summarization
    2. Data loading and pre-processing
    3. Data tokenization and vectorization
    4. Seq2seq model with attention
      1. Encoder model
      2. Bahdanau attention layer
      3. Decoder model
    5. Training the model
    6. Generating summaries
      1. Greedy search
      2. Beam search
      3. Decoding penalties with beam search
    7. Evaluating summaries
    8. ROUGE metric evaluation
    9. Summarization – state of the art
    10. Summary
  8. Multi-Modal Networks and Image Captioning with ResNets and Transformer Networks
    1. Multi-modal deep learning
      1. Vision and language tasks
    2. Image captioning
    3. MS-COCO dataset for image captioning
    4. Image processing with CNNs and ResNet50
      1. CNNs
        1. Convolutions
        2. Pooling
        3. Regularization with dropout
        4. Residual connections and ResNets
    5. Image feature extraction with ResNet50
    6. The Transformer model
      1. Positional encoding and masks
      2. Scaled dot-product and multi-head attention
      3. VisualEncoder
      4. Decoder
      5. Transformer
    7. Training the Transformer model with VisualEncoder
      1. Loading training data
      2. Instantiating the Transformer model
      3. Custom learning rate schedule
      4. Loss and metrics
      5. Checkpoints and masks
      6. Custom training
    8. Generating captions
    9. Improving performance and state-of-the-art models
    10. Summary
  9. Weakly Supervised Learning for Classification with Snorkel
    1. Weak supervision
      1. Inner workings of weak supervision with labeling functions
    2. Using weakly supervised labels to improve IMDb sentiment analysis
      1. Pre-processing the IMDb dataset
      2. Learning a subword tokenizer
      3. A BiLSTM baseline model
        1. Tokenization and vectorizing data
        2. Training using a BiLSTM model
    3. Weakly supervised labeling with Snorkel
      1. Iterating on labeling functions
    4. Naïve Bayes model for finding keywords
      1. Evaluating weakly supervised labels on the training set
      2. Generating unsupervised labels for unlabeled data
      3. Training BiLSTM on weakly supervised data from Snorkel
    5. Summary
  10. Building Conversational AI Applications with Deep Learning
    1. Overview of conversational agents
      1. Task-oriented or slot-filling systems
    2. Question-answering and MRC conversational agents
    3. General conversational agents
    4. Summary
    5. Epilogue
  11. Installation and Setup Instructions for Code
    1. GitHub location
    2. Chapter 1 installation instructions
    3. Chapter 2 installation instructions
    4. Chapter 3 installation instructions
    5. Chapter 4 installation instructions
    6. Chapter 5 installation instructions
    7. Chapter 6 installation instructions
    8. Chapter 7 installation instructions
    9. Chapter 8 installation instructions
    10. Chapter 9 installation instructions
  12. Other Books You May Enjoy
  13. Index

Product information

  • Title: Advanced Natural Language Processing with TensorFlow 2
  • Author(s): Ashish Bansal
  • Release date: February 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781800200937