O'Reilly Live Online Training

Deep Learning for Natural Language Processing (NLP): Complete Artificial Intelligence Series

Powerful, Efficient Processing of Natural Language with Deep Neural Networks

Jon Krohn

Relatively obscure a few short years ago, Deep Learning is ubiquitous today across data-driven applications as diverse as machine vision, super-human game-playing, and natural language processing (NLP).

This Live Training builds on the fundamentals of Deep Learning to develop a specialization in handling natural language data and building powerful, efficient, broadly-applicable predictive models that have sequences of words as inputs.

To facilitate an intuitive understanding of NLP and neural-network layers particularly well-suited to processing natural language data (e.g., word vectors, RNNs, GRUs, LSTMs), essential theory will be introduced visually and pragmatically. Theory will immediately be brought to life with interactive demos and hands-on exercises within Jupyter notebooks that feature Python, TensorFlow 2, and Keras layers, the high-level TensorFlow API.

This is part of Jon Krohn’s Complete Artificial Intelligence Series, a collection of interactive trainings that together comprehensively cover the foundations of modern AI approaches. The recommended progression through the Series is to take one of these two introductory sessions:

Following either of the introductory sessions (or if you’re familiar with the content covered in Chapters 1 and 5-9 of Jon Krohn’s Deep Learning Illustrated book), you’re well-prepared to specialize in any of the other Live Trainings in the Complete Artificial Intelligence Series, which can be undertaken in any order you fancy:

(Note that at any given time, only a subset of these classes will be scheduled and open for registration. To receive notifications of upcoming classes in the series, sign up for the instructor’s email newsletter at jonkrohn.com.)

What you'll learn and how you can apply it

  • Preprocess natural language data and create word vectors for use in machine learning applications
  • Leverage Keras and its TensorFlow backend to make predictions with Deep Learning models trained on natural language
  • Improve Deep Learning model performance by tuning hyperparameters

This training course is for you because...

  • You already have a working understanding of the fundamentals of Deep Learning
  • You want to apply state-of-the-art Deep Learning models to natural language data
  • You want to be able to transform natural language into quantitative representations that can be used as inputs into a broad range of machine learning models


Prerequisites

  • Experience with an object-oriented programming language, e.g., Python (all code demos during the training will be in Python)
  • A working understanding of the fundamentals of Deep Learning would make it a lot easier to follow along with the training

Materials, downloads, or Supplemental Content needed in advance:

  • During class, we’ll work on Jupyter notebooks interactively in the cloud via Google Colab. This requires nearly zero setup, and instructions will be provided in class. If you’d like to take a sneak peek at the notebooks we’ll be using, check out https://github.com/jonkrohn/DLTFpT


About your instructor

  • Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the 2019 book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon’s also the presenter of dozens of hours of popular video tutorials such as Deep Learning with TensorFlow, Keras, and PyTorch. And he’s renowned for his compelling lectures, which he offers in-person at Columbia University, New York University, and the NYC Data Science Academy. Jon holds a PhD in neuroscience from Oxford and has been publishing on machine learning in leading academic journals since 2010.


The timeframes are only estimates and may vary according to how the class is progressing

Segment 1: The Power and Elegance of Deep Learning for NLP (45 min)

  • Training Overview
  • Introduction to Deep Learning for Natural Language Processing
  • Easy, Intermediate, and Complex NLP Applications
  • Deep Learning vs Traditional Machine Learning
  • Review of Prerequisite Deep Learning Theory, including Artificial Neurons, Activation Functions, Cost Functions, Gradient Descent, Backpropagation, Weight Initialization, Dense Layers, Convolutional Layers, Max-Pooling, and Dropout
  • Word Vectors: Representing Language as Embeddings
  • Word Vector Arithmetic
  • An Interactive Visualization of Vector-Space Embeddings
  • Vector-Based Representations vs One-Hot Encodings
  • Break + Q&A (5 minutes)
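
As a rough illustration of the Segment 1 topics on word vector arithmetic and vector-based representations vs one-hot encodings, here is a minimal sketch in plain Python. The vocabulary, the two-dimensional embedding values, and the helper names (`one_hot`, `cosine`, `analogy`) are all invented for illustration and are not taken from the course notebooks; real word vectors are learned from data and have many more dimensions.

```python
import math

# Hypothetical toy embeddings with two hand-picked dimensions:
# (royalty, femaleness). Real embeddings are learned, not hand-written.
embeddings = {
    "king":     [0.9, 0.1],
    "queen":    [0.9, 0.9],
    "man":      [0.1, 0.1],
    "woman":    [0.1, 0.9],
    "prince":   [0.8, 0.1],
    "princess": [0.8, 0.9],
}
vocab = list(embeddings)

def one_hot(word):
    """Return a vector with a single 1.0 at the word's vocabulary index."""
    return [1.0 if w == word else 0.0 for w in vocab]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

# One-hot vectors of distinct words are orthogonal, so they carry no
# notion of similarity; dense vectors can place related words close together.
one_hot_similarity = cosine(one_hot("king"), one_hot("queen"))
dense_similarity = cosine(embeddings["king"], embeddings["queen"])

# Word vector arithmetic: king - man + woman
result = analogy("king", "man", "woman")
```

With these made-up values, `one_hot_similarity` is zero while `dense_similarity` is positive, and the analogy lookup lands on "queen". That outcome is a consequence of how the toy vectors were chosen, not evidence about any trained model.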

Segment 2: Modeling Natural Language Data (90 min)

  • Best Practices for Preprocessing Natural Language Data for Machine Learning Applications
  • Using word2vec to Create Word Vectors
  • Document Classification with a Dense Neural Network
  • Document Classification with a Convolutional Neural Network
  • Break + Q&A (5 minutes)
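
To give a flavor of what preprocessing natural language data can involve before it reaches a neural network, the sketch below lowercases text, strips punctuation, removes stop words, and converts tokens to padded integer sequences. It uses only the Python standard library; the stop-word list, the function names, and the choice of reserving index 0 for padding and 1 for unknown words are assumptions made for this example, not the specific practices taught in the session.

```python
import string

# A tiny illustrative stop-word list; real projects typically use a larger one.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}
PAD_ID, UNKNOWN_ID = 0, 1  # assumed convention for this sketch

def preprocess(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [token for token in text.split() if token not in STOP_WORDS]

def build_vocab(tokenized_docs):
    """Map each distinct token to an integer ID, in order of first appearance."""
    vocab = {}
    for tokens in tokenized_docs:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab) + 2  # IDs 0 and 1 are reserved
    return vocab

def encode(tokens, vocab, max_len):
    """Convert tokens to IDs, truncating or right-padding to max_len."""
    ids = [vocab.get(token, UNKNOWN_ID) for token in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

raw_docs = ["The movie is GREAT!", "A dull, dull plot."]
tokenized = [preprocess(doc) for doc in raw_docs]
vocab = build_vocab(tokenized)
encoded = [encode(tokens, vocab, max_len=4) for tokens in tokenized]
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer, which is why padding and truncation appear here even though the example stops short of building a model.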

Segment 3: Recurrent and Advanced Neural Networks (45 min)

  • Recurrent Neural Networks (RNNs)
  • Long Short-Term Memory Units (LSTMs)
  • Gated Recurrent Units (GRUs)
  • Bi-Directional LSTMs
  • Stacked LSTMs
  • Parallel Network Architectures
  • Transformer Architectures: BERT, ELMo, GPT-2, and Friends
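
The idea shared by the recurrent layers listed above is that a hidden state is carried from one time step to the next, so the order of the inputs affects the result. The sketch below shows that recurrence for a simple RNN with scalar inputs and a scalar state in plain Python. The function name and weight values are hypothetical, and gated units such as LSTMs and GRUs add further machinery that is not shown here.

```python
import math

def rnn_forward(inputs, w_x, w_h, b):
    """Apply h_t = tanh(w_x * x_t + w_h * h_{t-1} + b) over a scalar sequence.

    Returns the list of hidden states, one per time step, starting from h_0 = 0.
    """
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# The same two inputs in a different order give a different final state,
# because each step depends on the state left behind by the previous one.
states_ab = rnn_forward([1.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
states_ba = rnn_forward([0.0, 1.0], w_x=1.0, w_h=0.5, b=0.0)
```

In a trained network the weights are matrices learned by backpropagation rather than fixed scalars, but the loop structure is the same.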

Break + Q&A (5 minutes)