
Natural Language Processing with PyTorch

Book Description

Natural Language Processing (NLP) provides boundless opportunities for solving problems in artificial intelligence, making products such as Amazon Alexa and Google Translate possible. If you’re a developer or data scientist new to NLP and deep learning, this practical guide shows you how to apply these methods using PyTorch, a Python-based deep learning library.

Authors Delip Rao and Brian McMahan give you a solid grounding in NLP and deep learning algorithms and demonstrate how to use PyTorch to build applications involving rich representations of text specific to the problems you face. Each chapter includes several code examples and illustrations.

  • Explore computational graphs and the supervised learning paradigm
  • Master the basics of the PyTorch optimized tensor manipulation library
  • Get an overview of traditional NLP concepts and methods
  • Learn the basic ideas involved in building neural networks
  • Use embeddings to represent words, sentences, documents, and other features
  • Explore sequence prediction and generate output with sequence-to-sequence models
  • Learn design patterns for building production NLP systems
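As a flavor of the observation-encoding methods listed above (and covered under "Observation and Target Encoding" in Chapter 1), here is a minimal pure-Python sketch of one-hot and term-frequency (TF) representations. The corpus, vocabulary construction, and function names are illustrative assumptions; the book itself builds these representations with PyTorch tensors.

```python
# One-hot and term-frequency (TF) encodings in plain Python,
# illustrating the observation-encoding ideas from Chapter 1.
# Hypothetical toy corpus; the book uses PyTorch tensors for this.

from collections import Counter

corpus = ["time flies like an arrow", "fruit flies like a banana"]

# Build a vocabulary: each unique token maps to an integer index.
vocab = {tok: i for i, tok in
         enumerate(sorted({t for sent in corpus for t in sent.split()}))}

def one_hot(token):
    """Return a one-hot vector (all zeros except the token's index)."""
    vec = [0] * len(vocab)
    vec[vocab[token]] = 1
    return vec

def term_frequency(sentence):
    """Return a TF vector: the raw count of each vocabulary token."""
    counts = Counter(sentence.split())
    return [counts[tok] for tok in sorted(vocab, key=vocab.get)]

print(one_hot("flies"))
print(term_frequency("fruit flies like a banana"))
```

A TF-IDF representation (also covered in Chapter 1) would then reweight these counts by each token's inverse document frequency across the corpus.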

Table of Contents

  1. Preface
    1. Conventions Used in This Book
    2. Using Code Examples
    3. O’Reilly Safari
    4. How to Contact Us
    5. Acknowledgments
  2. 1. Introduction
    1. The Supervised Learning Paradigm
    2. Observation and Target Encoding
      1. One-Hot Representation
      2. TF Representation
      3. TF-IDF Representation
      4. Target Encoding
    3. Computational Graphs
    4. PyTorch Basics
      1. Installing PyTorch
      2. Creating Tensors
      3. Tensor Types and Size
      4. Tensor Operations
      5. Indexing, Slicing, and Joining
      6. Tensors and Computational Graphs
      7. CUDA Tensors
    5. Exercises
    6. Solutions
    7. Summary
    8. References
  3. 2. A Quick Tour of Traditional NLP
    1. Corpora, Tokens, and Types
    2. Unigrams, Bigrams, Trigrams, …, N-grams
    3. Lemmas and Stems
    4. Categorizing Sentences and Documents
    5. Categorizing Words: POS Tagging
    6. Categorizing Spans: Chunking and Named Entity Recognition
    7. Structure of Sentences
    8. Word Senses and Semantics
    9. Summary
    10. References
  4. 3. Foundational Components of Neural Networks
    1. The Perceptron: The Simplest Neural Network
    2. Activation Functions
      1. Sigmoid
      2. Tanh
      3. ReLU
      4. Softmax
    3. Loss Functions
      1. Mean Squared Error Loss
      2. Categorical Cross-Entropy Loss
      3. Binary Cross-Entropy Loss
    4. Diving Deep into Supervised Training
      1. Constructing Toy Data
      2. Putting It Together: Gradient-Based Supervised Learning
    5. Auxiliary Training Concepts
      1. Correctly Measuring Model Performance: Evaluation Metrics
      2. Correctly Measuring Model Performance: Splitting the Dataset
      3. Knowing When to Stop Training
      4. Finding the Right Hyperparameters
      5. Regularization
    6. Example: Classifying Sentiment of Restaurant Reviews
      1. The Yelp Review Dataset
      2. Understanding PyTorch’s Dataset Representation
      3. The Vocabulary, the Vectorizer, and the DataLoader
      4. A Perceptron Classifier
      5. The Training Routine
      6. Evaluation, Inference, and Inspection
    7. Summary
    8. References
  5. 4. Feed-Forward Networks for Natural Language Processing
    1. The Multilayer Perceptron
      1. A Simple Example: XOR
      2. Implementing MLPs in PyTorch
    2. Example: Surname Classification with an MLP
      1. The Surnames Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The SurnameClassifier Model
      4. The Training Routine
      5. Model Evaluation and Prediction
      6. Regularizing MLPs: Weight Regularization and Structural Regularization (or Dropout)
    3. Convolutional Neural Networks
      1. CNN Hyperparameters
      2. Implementing CNNs in PyTorch
    4. Example: Classifying Surnames by Using a CNN
      1. The SurnameDataset Class
      2. Vocabulary, Vectorizer, and DataLoader
      3. Reimplementing the SurnameClassifier with Convolutional Networks
      4. The Training Routine
      5. Model Evaluation and Prediction
    5. Miscellaneous Topics in CNNs
      1. Pooling
      2. Batch Normalization (BatchNorm)
      3. Network-in-Network Connections (1x1 Convolutions)
      4. Residual Connections/Residual Block
    6. Summary
    7. References
  6. 5. Embedding Words and Types
    1. Why Learn Embeddings?
      1. Efficiency of Embeddings
      2. Approaches to Learning Word Embeddings
      3. The Practical Use of Pretrained Word Embeddings
    2. Example: Learning the Continuous Bag of Words Embeddings
      1. The Frankenstein Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The CBOWClassifier Model
      4. The Training Routine
      5. Model Evaluation and Prediction
    3. Example: Transfer Learning Using Pretrained Embeddings for Document Classification
      1. The AG News Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The NewsClassifier Model
      4. The Training Routine
      5. Model Evaluation and Prediction
      6. Evaluating on the Test Dataset
    4. Summary
    5. References
  7. 6. Sequence Modeling for Natural Language Processing
    1. Introduction to Recurrent Neural Networks
      1. Implementing an Elman RNN
    2. Example: Classifying Surname Nationality Using a Character RNN
      1. The SurnameDataset Class
      2. The Vectorization Data Structures
      3. The SurnameClassifier Model
      4. The Training Routine and Results
    3. Summary
    4. References
  8. 7. Intermediate Sequence Modeling for Natural Language Processing
    1. The Problem with Vanilla RNNs (or Elman RNNs)
    2. Gating as a Solution to a Vanilla RNN’s Challenges
    3. Example: A Character RNN for Generating Surnames
      1. The SurnameDataset Class
      2. The Vectorization Data Structures
      3. From the ElmanRNN to the GRU
      4. Model 1: The Unconditioned SurnameGenerationModel
      5. Model 2: The Conditioned SurnameGenerationModel
      6. The Training Routine and Results
    4. Tips and Tricks for Training Sequence Models
    5. References
  9. 8. Advanced Sequence Modeling for Natural Language Processing
    1. Sequence-to-Sequence Models, Encoder–Decoder Models, and Conditioned Generation
    2. Capturing More from a Sequence: Bidirectional Recurrent Models
    3. Capturing More from a Sequence: Attention
      1. Attention in Deep Neural Networks
    4. Evaluating Sequence Generation Models
    5. Example: Neural Machine Translation
      1. The Machine Translation Dataset
      2. A Vectorization Pipeline for NMT
      3. Encoding and Decoding in the NMT Model
      4. The Training Routine and Results
    6. Summary
    7. References
  10. 9. Classics, Frontiers, and Next Steps
    1. What Have We Learned So Far?
    2. Timeless Topics in NLP
      1. Dialogue and Interactive Systems
      2. Discourse
      3. Information Extraction and Text Mining
      4. Document Analysis and Retrieval
    3. Frontiers in NLP
    4. Design Patterns for Production NLP Systems
    5. Where Next?
    6. References
  11. Index