Natural Language Processing with PyTorch

Book Description

With Early Release ebooks, you get books in their earliest form—the author's raw and unedited content as he or she writes—so you can take advantage of these technologies long before the official release of these titles. You’ll also receive updates when significant changes are made, new chapters are available, and the final ebook bundle is released.

Natural Language Processing (NLP) offers unbounded opportunities for solving interesting problems in artificial intelligence, making it the latest frontier for developing intelligent, deep learning-based applications. If you’re a developer or researcher ready to dive deeper into this rapidly growing area of artificial intelligence, this practical book shows you how to use the PyTorch deep learning framework to implement recently developed NLP techniques. To get started, all you need is a machine learning background and experience programming with Python.

Authors Delip Rao and Goku Mohandas provide you with a solid grounding in PyTorch and in deep learning algorithms for building applications that involve the semantic representation of text. Each chapter includes several code examples and illustrations.

  • Get extensive introductions to NLP, deep learning, and PyTorch
  • Understand traditional NLP methods and tools, including NLTK, spaCy, and gensim
  • Explore embeddings: high-quality representations of the words in a language
  • Learn how to represent language sequences using recurrent neural networks (RNNs)
  • Improve on RNN results with more complex architectures, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs)
  • Explore sequence-to-sequence models (used in translation) that read one sequence and produce another
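
Although this page only lists the topics, the book itself works through them as runnable PyTorch code. As an illustrative, hypothetical sketch (not taken from the book), the snippet below embeds a short sequence of token IDs and runs it through a simple RNN, the kind of building block covered in the chapters on embeddings and sequence modeling:

    import torch
    import torch.nn as nn

    # Hypothetical toy setup: a 100-token vocabulary embedded into
    # 16-dimensional vectors, fed through a single-layer Elman RNN.
    embedding = nn.Embedding(num_embeddings=100, embedding_dim=16)
    rnn = nn.RNN(input_size=16, hidden_size=32, batch_first=True)

    token_ids = torch.randint(0, 100, (1, 5))  # a batch of one 5-token sequence
    embedded = embedding(token_ids)            # shape: (1, 5, 16)
    outputs, hidden = rnn(embedded)            # outputs: (1, 5, 32); hidden: (1, 1, 32)
    print(outputs.shape, hidden.shape)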

Table of Contents

  1. Preface
  2. 1. Introduction
    1. The Supervised Learning Paradigm
    2. Observation and Target Encoding
      1. One-Hot Representation
      2. TF Representation
      3. TF-IDF Representation
      4. Target Encoding
    3. Computational Graphs
    4. PyTorch Basics
      1. Installing PyTorch
      2. Creating Tensors
      3. Tensor Types and Size
      4. Tensor Operations
      5. Indexing, Slicing, and Joining
      6. Tensors and Computational Graphs
      7. CUDA Tensors
    5. Exercises
    6. Solutions
    7. Summary
    8. References
  3. 2. A Quick Tour of Traditional NLP
    1. Corpora, Tokens, and Types
    2. Unigrams, Bigrams, Trigrams, …, N-grams
    3. Lemmas and Stems
    4. Categorizing Sentences and Documents
    5. Categorizing Words: POS Tagging
    6. Categorizing Spans: Chunking and Named Entity Recognition
    7. Structure of Sentences
    8. Word Senses and Semantics
    9. Summary
    10. References
  4. 3. Foundational Components of Neural Networks
    1. Perceptron: The Simplest Neural Network
    2. Activation Functions
      1. Sigmoid
      2. Tanh
      3. ReLU
      4. Softmax
    3. Loss Functions
      1. Mean Squared Error Loss
      2. Categorical Cross-Entropy Loss
      3. Binary Cross-Entropy
    4. Diving Deep into Supervised Training
      1. Constructing Toy Data
      2. Putting It Together: Gradient-Based Supervised Learning
      3. Auxiliary Training Concepts
      4. Correctly Measuring Model Performance: Evaluation Metrics
      5. Correctly Measuring Model Performance: Splitting the Dataset
      6. Knowing When to Stop Training
      7. Finding the Right Hyperparameters
      8. Regularization
    5. Example: Classifying Sentiment of Restaurant Reviews
      1. The Yelp Review Dataset
      2. Understanding PyTorch’s Dataset Representation
      3. The Vocabulary, the Vectorizer, and the DataLoader
      4. A Perceptron Classifier
      5. The Training Routine
      6. Evaluation, Inference, and Inspection
    6. Summary
  5. 4. Feed-Forward Networks for Natural Language Processing
    1. The Multilayer Perceptron
      1. A Simple Example: XOR
      2. Implementing MLPs in PyTorch
    2. Example: Surname Classification with a Multilayer Perceptron
      1. The Surname Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The Surname Classifier Model
      4. The Training Routine
      5. Model Evaluation and Prediction
      6. Regularizing MLPs: Weight Regularization and Structural Regularization (or Dropout)
    3. Convolutional Neural Networks
      1. CNN Hyperparameters
      2. Implementing CNNs in PyTorch
    4. Example: Classifying Surnames by Using a CNN
      1. The SurnameDataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. Reimplementing the SurnameClassifier with Convolutional Networks
      4. The Training Routine
      5. Model Evaluation and Prediction
    5. Miscellaneous Topics in CNNs
      1. Pooling Operation
      2. Batch Normalization (BatchNorm)
      3. Network-in-Network Connections (1x1 Convolutions)
      4. Residual Connections/Residual Block
    6. Summary
    7. References
  6. 5. Embedding Words and Types
    1. Why Learn Embeddings?
      1. Efficiency of Embeddings
      2. Approaches to Learning Word Embeddings
      3. The Practical Use of Pretrained Word Embeddings
    2. Example: Learning the Continuous Bag of Words Embeddings
      1. Frankenstein Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The CBOW Classifier
      4. Training Routine
      5. Model Evaluation and Prediction
    3. Example: Transfer Learning Using Pretrained Embeddings for Document Classification
      1. The AG News Dataset
      2. Vocabulary, Vectorizer, and DataLoader
      3. The News Classifier
      4. The Training Routine
      5. Model Evaluation and Prediction
      6. Evaluating on the Test Dataset
    4. Summary
    5. References
  7. 6. Sequence Modeling for Natural Language Processing
    1. Introduction to Recurrent Neural Networks
      1. Implementing an Elman RNN
    2. Example: Classifying Surname Nationality Using a Character RNN
      1. The Surnames Dataset
      2. The Vectorization Data Structures
      3. The SurnameClassifier Model
      4. The Training Routine and Results
    3. Summary
  8. 7. Intermediate Sequence Modeling for Natural Language Processing
    1. The Problem with Vanilla RNNs (or Elman RNNs)
    2. Gating as a Solution to a Vanilla RNN’s Challenges
    3. Example: A Character-RNN for Generating Surnames
      1. The SurnamesDataset
      2. The Vectorization Data Structures
      3. From the ElmanRNN to the GRU
      4. Model 1: Unconditioned Surname Generation Model
      5. Model 2: Conditioned Surname Generation Model
      6. Training Routine and Results
    4. Tips and Tricks for Training Sequence Models
    5. References
  9. 8. Advanced Sequence Modeling for Natural Language Processing
    1. Sequence-to-Sequence Models, Encoder–Decoder Models, and Conditioned Generation
    2. Capturing More from a Sequence: Bidirectional Recurrent Models
    3. Capturing More from a Sequence: Attention
      1. Attention in Deep Neural Networks
    4. Evaluating Sequence Generation Models
    5. Example: Neural Machine Translation
      1. Machine Translation Dataset
      2. A Vectorization Pipeline for NMT
      3. Encoding and Decoding in the NMT Model
      4. Training Routine and Results
    6. Summary
    7. References
  10. 9. Classics, Frontiers, and Next Steps
    1. What Have We Learned so Far?
    2. Timeless Topics in NLP
      1. Dialogue and Interactive Systems
      2. Discourse
      3. Information Extraction and Text Mining
      4. Document Analysis and Retrieval
    3. Frontiers in NLP
    4. Design Patterns for Production NLP Systems
    5. Where Next?
    6. References