Generative Deep Learning

Book Description

Generative modeling is one of the hottest topics in AI. It’s now possible to teach a machine to excel at human endeavors such as painting, writing, and composing music. With this practical book, machine-learning engineers and data scientists will discover how to re-create some of the most impressive examples of generative deep learning models, such as variational autoencoders, generative adversarial networks (GANs), encoder-decoder models, and world models.

Author David Foster demonstrates the inner workings of each technique, starting with the basics of deep learning before advancing to some of the most cutting-edge algorithms in the field. Through tips and tricks, you’ll understand how to make your models learn more efficiently and become more creative.

  • Discover how variational autoencoders can change facial expressions in photos
  • Build practical GAN examples from scratch, including CycleGAN for style transfer and MuseGAN for music generation
  • Create recurrent generative models for text generation and learn how to improve the models using attention
  • Understand how generative models can help agents to accomplish tasks within a reinforcement learning setting
  • Explore the architecture of the Transformer (BERT, GPT-2) and image generation models such as ProGAN and StyleGAN

Table of Contents

  1. Preface
    1. Objective and Approach
    2. Prerequisites
    3. Other Resources
    4. Conventions Used in This Book
    5. Using Code Examples
    6. O’Reilly Online Learning
    7. How to Contact Us
    8. Acknowledgments
  2. I. Introduction to Generative Deep Learning
  3. 1. Generative Modeling
    1. What Is Generative Modeling?
      1. Generative Versus Discriminative Modeling
      2. Advances in Machine Learning
      3. The Rise of Generative Modeling
      4. The Generative Modeling Framework
    2. Probabilistic Generative Models
      1. Hello Wrodl!
      2. Your First Probabilistic Generative Model
      3. Naive Bayes
      4. Hello Wrodl! Continued
    3. The Challenges of Generative Modeling
      1. Representation Learning
    4. Setting Up Your Environment
    5. Summary
  4. 2. Deep Learning
    1. Structured and Unstructured Data
    2. Deep Neural Networks
      1. Keras and TensorFlow
    3. Your First Deep Neural Network
      1. Loading the Data
      2. Building the Model
      3. Compiling the Model
      4. Training the Model
      5. Evaluating the Model
    4. Improving the Model
      1. Convolutional Layers
      2. Batch Normalization
      3. Dropout Layers
      4. Putting It All Together
    5. Summary
  5. 3. Variational Autoencoders
    1. The Art Exhibition
    2. Autoencoders
      1. Your First Autoencoder
      2. The Encoder
      3. The Decoder
      4. Joining the Encoder to the Decoder
      5. Analysis of the Autoencoder
    3. The Variational Art Exhibition
    4. Building a Variational Autoencoder
      1. The Encoder
      2. The Loss Function
      3. Analysis of the Variational Autoencoder
    5. Using VAEs to Generate Faces
      1. Training the VAE
      2. Analysis of the VAE
      3. Generating New Faces
      4. Latent Space Arithmetic
      5. Morphing Between Faces
    6. Summary
  6. 4. Generative Adversarial Networks
    1. Ganimals
    2. Introduction to GANs
    3. Your First GAN
      1. The Discriminator
      2. The Generator
      3. Training the GAN
    4. GAN Challenges
      1. Oscillating Loss
      2. Mode Collapse
      3. Uninformative Loss
      4. Hyperparameters
      5. Tackling the GAN Challenges
    5. Wasserstein GAN
      1. Wasserstein Loss
      2. The Lipschitz Constraint
      3. Weight Clipping
      4. Training the WGAN
      5. Analysis of the WGAN
    6. WGAN-GP
      1. The Gradient Penalty Loss
      2. Analysis of WGAN-GP
    7. Summary
  7. II. Teaching Machines to Paint, Write, Compose, and Play
  8. 5. Paint
    1. Apples and Oranges
    2. CycleGAN
    3. Your First CycleGAN
      1. Overview
      2. The Generators (U-Net)
      3. The Discriminators
      4. Compiling the CycleGAN
      5. Training the CycleGAN
      6. Analysis of the CycleGAN
    4. Creating a CycleGAN to Paint Like Monet
      1. The Generators (ResNet)
      2. Analysis of the CycleGAN
    5. Neural Style Transfer
      1. Content Loss
      2. Style Loss
      3. Total Variance Loss
      4. Running the Neural Style Transfer
      5. Analysis of the Neural Style Transfer Model
    6. Summary
  9. 6. Write
    1. The Literary Society for Troublesome Miscreants
    2. Long Short-Term Memory Networks
    3. Your First LSTM Network
      1. Tokenization
      2. Building the Dataset
      3. The LSTM Architecture
      4. The Embedding Layer
      5. The LSTM Layer
      6. The LSTM Cell
    4. Generating New Text
    5. RNN Extensions
      1. Stacked Recurrent Networks
      2. Gated Recurrent Units
      3. Bidirectional Cells
    6. Encoder–Decoder Models
    7. A Question and Answer Generator
      1. A Question-Answer Dataset
      2. Model Architecture
      3. Inference
      4. Model Results
    8. Summary
  10. 7. Compose
    1. Preliminaries
      1. Musical Notation
    2. Your First Music-Generating RNN
      1. Attention
      2. Building an Attention Mechanism in Keras
      3. Analysis of the RNN with Attention
      4. Attention in Encoder–Decoder Networks
      5. Generating Polyphonic Music
    3. The Musical Organ
    4. Your First MuseGAN
    5. The MuseGAN Generator
      1. Chords, Style, Melody, and Groove
      2. The Bar Generator
      3. Putting It All Together
    6. The Critic
    7. Analysis of the MuseGAN
    8. Summary
  11. 8. Play
    1. Reinforcement Learning
      1. OpenAI Gym
    2. World Model Architecture
      1. The Variational Autoencoder
      2. The MDN-RNN
      3. The Controller
    3. Setup
    4. Training Process Overview
    5. Collecting Random Rollout Data
    6. Training the VAE
      1. The VAE Architecture
      2. Exploring the VAE
    7. Collecting Data to Train the RNN
    8. Training the MDN-RNN
      1. The MDN-RNN Architecture
      2. Sampling the Next z and Reward from the MDN-RNN
      3. The MDN-RNN Loss Function
    9. Training the Controller
      1. The Controller Architecture
      2. CMA-ES
      3. Parallelizing CMA-ES
      4. Output from the Controller Training
    10. In-Dream Training
      1. In-Dream Training the Controller
      2. Challenges of In-Dream Training
    11. Summary
  12. 9. The Future of Generative Modeling
    1. Five Years of Progress
    2. The Transformer
      1. Positional Encoding
      2. Multihead Attention
      3. The Decoder
      4. Analysis of the Transformer
      5. BERT
      6. GPT-2
      7. MuseNet
    3. Advances in Image Generation
      1. ProGAN
      2. Self-Attention GAN (SAGAN)
      3. BigGAN
      4. StyleGAN
    4. Applications of Generative Modeling
      1. AI Art
      2. AI Music
  13. 10. Conclusion
  14. Index