Deep Learning for Natural Language Processing

Book description

Gain knowledge of various deep neural network architectures and their areas of application to solve your NLP problems

Key Features

  • Gain insights into the basic building blocks of natural language processing
  • Learn how to select the best deep neural network to solve your NLP problems
  • Explore convolutional and recurrent neural networks and long short-term memory networks

Book Description

Applying deep learning approaches to various NLP tasks can take the speed and accuracy of your algorithms to a completely new level. Deep Learning for Natural Language Processing starts by highlighting the basic building blocks of the natural language processing domain.

The book goes on to introduce the problems that you can solve using state-of-the-art neural network models. Delving into the various neural network architectures and their specific areas of application will then help you understand how to select the model best suited to your needs. As you advance through this deep learning book, you'll study convolutional, recurrent, and recursive neural networks, as well as long short-term memory (LSTM) networks, and learn how to implement them using Keras. In the later chapters, you will develop a trigger word detection application using NLP techniques such as attention models and beam search.
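
For a flavor of what those chapters build toward, here is a minimal sketch of a Keras LSTM text classifier. It is not taken from the book; it assumes TensorFlow 2.x, and the vocabulary size, sequence length, and layer widths are illustrative placeholders:

    # A minimal sketch, assuming TensorFlow 2.x is installed.
    # All sizes below are illustrative, not the book's own values.
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Input, Embedding, LSTM, Dense

    model = Sequential([
        Input(shape=(100,)),                         # sequences padded to length 100
        Embedding(input_dim=10000, output_dim=128),  # word indices -> 128-d vectors
        LSTM(64),                                    # summarize the token sequence
        Dense(1, activation="sigmoid"),              # probability of the positive class
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.summary()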

By the end of this book, you will not only have a sound knowledge of natural language processing but will also be able to select the text preprocessing techniques and neural network models best suited to a range of NLP problems.

What you will learn

  • Understand various text preprocessing techniques used in deep learning problems
  • Build a vector representation of text using word2vec and GloVe (see the sketch after this list)
  • Create a named entity recognizer and part-of-speech tagger with Apache OpenNLP
  • Build a machine translation model in Keras
  • Develop a text generation application using LSTM
  • Build a trigger word detection application using an attention model
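
As a taste of the word2vec item above, here is a minimal sketch of generating word embeddings with gensim's Word2Vec. It assumes gensim 4.x; the toy corpus and all parameters are illustrative placeholders, not the book's own example:

    # Minimal sketch, assuming gensim >= 4.0 is installed (pip install gensim).
    from gensim.models import Word2Vec

    # A toy corpus: each document is a list of already-tokenized words.
    corpus = [
        ["natural", "language", "processing", "is", "fun"],
        ["deep", "learning", "powers", "natural", "language", "processing"],
    ]

    # Train skip-gram embeddings; vector_size and window are placeholders.
    model = Word2Vec(corpus, vector_size=50, window=2, min_count=1,
                     epochs=50, sg=1)

    print(model.wv["language"])               # the learned 50-d vector
    print(model.wv.most_similar("language"))  # nearest words in embedding space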

Who this book is for

If you're an aspiring data scientist looking for an introduction to deep learning in the NLP domain, this is just the book for you. Strong working knowledge of Python, linear algebra, and machine learning is a must.

Table of contents

  1. Preface
    1. About the Book
      1. About the Authors
      2. Description
      3. Learning Objectives
      4. Audience
      5. Approach
      6. Hardware Requirements
      7. Software Requirements
      8. Conventions
      9. Installation and Setup
      10. Install Python on Windows
      11. Install Python on Linux
      12. Install Python on macOS
      13. Installing Keras
      14. Additional Resources
  2. Chapter 1: Introduction to Natural Language Processing
    1. Introduction
    2. The Basics of Natural Language Processing
      1. Importance of Natural Language Processing
    3. Capabilities of Natural Language Processing
    4. Applications of Natural Language Processing
      1. Text Preprocessing
      2. Text Preprocessing Techniques
      3. Lowercasing/Uppercasing
      4. Exercise 1: Performing Lowercasing on a Sentence
      5. Noise Removal
      6. Exercise 2: Removing Noise from Words
      7. Text Normalization
      8. Stemming
      9. Exercise 3: Performing Stemming on Words
      10. Lemmatization
      11. Exercise 4: Performing Lemmatization on Words
      12. Tokenization
      13. Exercise 5: Tokenizing Words
      14. Exercise 6: Tokenizing Sentences
      15. Additional Techniques
      16. Exercise 7: Removing Stop Words
    5. Word Embeddings
      1. The Generation of Word Embeddings
      2. Word2Vec
      3. Functioning of Word2Vec
      4. Exercise 8: Generating Word Embeddings Using Word2Vec
      5. GloVe
      6. Exercise 9: Generating Word Embeddings Using GloVe
      7. Activity 1: Generating Word Embeddings from a Corpus Using Word2Vec
    6. Summary
  3. Chapter 2: Applications of Natural Language Processing
    1. Introduction
    2. POS Tagging
      1. Parts of Speech
      2. POS Tagger
    3. Applications of Parts of Speech Tagging
      1. Types of POS Taggers
      2. Rule-Based POS Taggers
      3. Exercise 10: Performing Rule-Based POS Tagging
      4. Stochastic POS Taggers
      5. Exercise 11: Performing Stochastic POS Tagging
    4. Chunking
      1. Exercise 12: Performing Chunking with NLTK
      2. Exercise 13: Performing Chunking with spaCy
    5. Chinking
      1. Exercise 14: Performing Chinking
      2. Activity 2: Building and Training Your Own POS Tagger
    6. Named Entity Recognition
      1. Named Entities
      2. Named Entity Recognizers
      3. Applications of Named Entity Recognition
      4. Types of Named Entity Recognizers
      5. Rule-Based NERs
      6. Stochastic NERs
      7. Exercise 15: Performing Named Entity Recognition with NLTK
      8. Exercise 16: Performing Named Entity Recognition with spaCy
      9. Activity 3: Performing NER on a Tagged Corpus
    7. Summary
  4. Chapter 3: Introduction to Neural Networks
    1. Introduction
      1. Introduction to Deep Learning
      2. Comparing Machine Learning and Deep Learning
    2. Neural Networks
      1. Neural Network Architecture
      2. The Layers
      3. Nodes
      4. The Edges
      5. Biases
      6. Activation Functions
    3. Training a Neural Network
      1. Calculating Weights
      2. The Loss Function
      3. The Gradient Descent Algorithm
      4. Backpropagation
    4. Designing a Neural Network and Its Applications
      1. Supervised Neural Networks
      2. Unsupervised Neural Networks
      3. Exercise 17: Creating a Neural Network
    5. Fundamentals of Deploying a Model as a Service
      1. Activity 4: Sentiment Analysis of Reviews
    6. Summary
  5. Chapter 4: Foundations of Convolutional Neural Networks
    1. Introduction
      1. Exercise 18: Finding Out How Computers See Images
    2. Understanding the Architecture of a CNN
      1. Feature Extraction
      2. Convolution
      3. The ReLU Activation Function
      4. Exercise 19: Visualizing ReLU
      5. Pooling
      6. Dropout
      7. Classification in Convolutional Neural Networks
      8. Exercise 20: Creating a Simple CNN Architecture
    3. Training a CNN
      1. Exercise 21: Training a CNN
      2. Applying CNNs to Text
      3. Exercise 22: Applying a Simple CNN to Reuters News Topic Classification
    4. Application Areas of CNNs
      1. Activity 5: Sentiment Analysis on a Real-life Dataset
    5. Summary
  6. Chapter 5: Recurrent Neural Networks
    1. Introduction
    2. Previous Versions of Neural Networks
    3. RNNs
      1. RNN Architectures
      2. BPTT
    4. Updates and Gradient Flow
      1. Adjusting Weight Matrix Wy
      2. Adjusting Weight Matrix Ws
      3. Adjusting Weight Matrix Wx
    5. Gradients
      1. Exploding Gradients
      2. Vanishing Gradients
      3. RNNs with Keras
      4. Exercise 23: Building an RNN Model to Show the Stability of Parameters over Time
      5. Stateful versus Stateless
      6. Exercise 24: Turning a Stateless Network into a Stateful Network by Only Changing Arguments
      7. Activity 6: Solving a Problem with an RNN – Author Attribution
    6. Summary
  7. Chapter 6: Gated Recurrent Units (GRUs)
    1. Introduction
    2. The Drawback of Simple RNNs
      1. The Exploding Gradient Problem
    3. Gated Recurrent Units (GRUs)
      1. Types of Gates
      2. The Update Gate
      3. The Reset Gate
      4. The Candidate Activation Function
      5. GRU Variations
    4. Sentiment Analysis with GRU
      1. Exercise 25: Calculating the Model Validation Accuracy and Loss for Sentiment Classification
      2. Activity 7: Developing a Sentiment Classification Model Using a Simple RNN
      3. Text Generation with GRUs
      4. Exercise 26: Generating Text Using GRUs
      5. Activity 8: Train Your Own Character Generation Model Using a Dataset of Your Choice
    5. Summary
  8. Chapter 7: Long Short-Term Memory (LSTM)
    1. Introduction
      1. LSTM
      2. The Forget Gate
    2. The Input Gate and the Candidate Cell State
      1. Cell State Update
    3. Output Gate and Current Activation
      1. Exercise 27: Building an LSTM-Based Model to Classify an Email as Spam or Not Spam (Ham)
      2. Activity 9: Building a Spam or Ham Classifier Using a Simple RNN
    4. Neural Language Translation
      1. Activity 10: Creating a French-to-English Translation Model
    5. Summary
  9. Chapter 8: State-of-the-Art Natural Language Processing
    1. Introduction
      1. Attention Mechanisms
      2. An Attention Mechanism Model
      3. Date Normalization Using an Attention Mechanism
      4. Encoder
      5. Decoder
      6. Attention Mechanisms
      7. The Calculation of Alpha
      8. Exercise 28: Building a Date Normalization Model for a Database Column
    2. Other Architectures and Developments
      1. Transformer
      2. BERT
      3. OpenAI GPT-2
    3. Activity 11: Build a Text Summarization Model
    4. Summary
  10. Chapter 9: A Practical NLP Project Workflow in an Organization
    1. Introduction
      1. General Workflow for the Development of a Machine Learning Product
      2. The Presentation Workflow
      3. The Research Workflow
      4. The Production-Oriented Workflow
    2. Problem Definition
    3. Data Acquisition
    4. Google Colab
    5. Flask
    6. Deployment
      1. Making Changes to a Flask Web App
      2. Using Docker to Wrap the Flask Web Application into a Container
      3. Hosting the Container on an Amazon Web Services (AWS) EC2 Instance
      4. Improvements
    7. Summary
  11. Appendix
    1. Chapter 1: Introduction to Natural Language Processing
      1. Activity 1: Generating Word Embeddings from a Corpus Using Word2Vec
    2. Chapter 2: Applications of Natural Language Processing
      1. Activity 2: Building and Training Your Own POS Tagger
      2. Activity 3: Performing NER on a Tagged Corpus
    3. Chapter 3: Introduction to Neural Networks
      1. Activity 4: Sentiment Analysis of Reviews
    4. Chapter 4: Foundations of Convolutional Neural Networks
      1. Activity 5: Sentiment Analysis on a Real-life Dataset
    5. Chapter 5: Recurrent Neural Networks
      1. Activity 6: Solving a Problem with an RNN – Author Attribution
      2. Preparing the Data
      3. Applying the Model to the Unknown Papers
    6. Chapter 6: Gated Recurrent Units (GRUs)
      1. Activity 7: Developing a Sentiment Classification Model Using a Simple RNN
      2. Activity 8: Train Your Own Character Generation Model Using a Dataset of Your Choice
    7. Chapter 7: Long Short-Term Memory (LSTM)
      1. Activity 10: Creating a French-to-English Translation Model
    8. Chapter 8: State-of-the-Art Natural Language Processing
      1. Activity 11: Build a Text Summarization Model
    9. Chapter 9: A Practical NLP Project Workflow in an Organization
      1. Code for LSTM model
      2. Code for Flask

Product information

  • Title: Deep Learning for Natural Language Processing
  • Author(s): Karthiek Reddy Bokka, Shubhangi Hora, Tanuj Jain, Monicah Wambugu
  • Release date: June 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781838550295