Inside Deep Learning

Name: Inside Deep Learning
Author: Edward Raff
ISBN: 9781617298639

by Edward Raff

June 2022

Intermediate to advanced

600 pages

17h 56m

English

Manning Publications

inside front cover
Inside Deep Learning
Copyright
dedication
contents
front matter
Foreword
Preface
Acknowledgments
About this book
Who should read this book?
How this book is organized: A road map
About the mathematical notations
About the exercises
About Google Colab
About the code
liveBook discussion forum
Other online resources
About the author
About the cover
Part 1. Foundational methods
1 The mechanics of learning
1.1 Getting started with Colab
1.2 The world as tensors
1.2.1 PyTorch GPU acceleration
1.3 Automatic differentiation
1.3.1 Using derivatives to minimize losses
1.3.2 Calculating a derivative with automatic differentiation
1.3.3 Putting it together: Minimizing a function with derivatives
1.4 Optimizing parameters
1.5 Loading dataset objects
1.5.1 Creating a training and testing split
Exercises
Summary
2 Fully connected networks
2.1 Neural networks as optimization
2.1.1 Notation of training a neural network
2.1.2 Building a linear regression model
2.1.3 The training loop
2.1.4 Defining a dataset
2.1.5 Defining the model
2.1.6 Defining the loss function
2.1.7 Putting it together: Training a linear regression model on the data
2.2 Building our first neural network
2.2.1 Notation for a fully connected network
2.2.2 A fully connected network in PyTorch
2.2.3 Adding nonlinearities
2.3 Classification problems
2.3.1 Classification toy problem
2.3.2 Classification loss function
2.3.3 Training a classification network
2.4 Better training code
2.4.1 Custom metrics
2.4.2 Training and testing passes
2.4.3 Saving checkpoints
2.4.4 Putting it all together: A better model training function
2.5 Training in batches
Exercises
Summary
3 Convolutional neural networks
3.1 Spatial structural prior beliefs
3.1.1 Loading MNIST with PyTorch
3.2 What are convolutions?
3.2.1 1D convolutions
3.2.2 2D convolutions
3.2.3 Padding
3.2.4 Weight sharing
3.3 How convolutions benefit image processing
3.4 Putting it into practice: Our first CNN
3.4.1 Making a convolutional layer with multiple filters
3.4.2 Using multiple filters per layer
3.4.3 Mixing convolutional layers with linear layers via flattening
3.4.4 PyTorch code for our first CNN
3.5 Adding pooling to mitigate object movement
3.5.1 CNNs with max pooling
3.6 Data augmentation
Exercises
Summary

4 Recurrent neural networks
4.1 Recurrent neural networks as weight sharing
4.1.1 Weight sharing for a fully connected network
4.1.2 Weight sharing over time
4.2 RNNs in PyTorch
4.2.1 A simple sequence classification problem
4.2.2 Embedding layers
4.2.3 Making predictions using the last time step
4.3 Improving training time with packing
4.3.1 Pad and pack
4.3.2 Packable embedding layer
4.3.3 Training a batched RNN
4.3.4 Simultaneous packed and unpacked inputs
4.4 More complex RNNs
4.4.1 Multiple layers
4.4.2 Bidirectional RNNs
Exercises
Summary
5 Modern training techniques
5.1 Gradient descent in two parts
5.1.1 Adding a learning rate schedule
5.1.2 Adding an optimizer
5.1.3 Implementing optimizers and schedulers
5.2 Learning rate schedules
5.2.1 Exponential decay: Smoothing erratic training
5.2.2 Step drop adjustment: Better smoothing
5.2.3 Cosine annealing: Greater accuracy but less stability
5.2.4 Validation plateau: Data-based adjustments
5.2.5 Comparing the schedules
5.3 Making better use of gradients
5.3.1 SGD with momentum: Adapting to gradient consistency
5.3.2 Adam: Adding variance to momentum
5.3.3 Gradient clipping: Avoiding exploding gradients
5.4 Hyperparameter optimization with Optuna
5.4.1 Optuna
5.4.2 Optuna with PyTorch
5.4.3 Pruning trials with Optuna
Exercises
Summary
6 Common design building blocks
6.1 Better activation functions
6.1.1 Vanishing gradients
6.1.2 Rectified linear units (ReLUs): Avoiding vanishing gradients
6.1.3 Training with LeakyReLU activations
6.2 Normalization layers: Magically better convergence
6.2.1 Where do normalization layers go?
6.2.2 Batch normalization
6.2.3 Training with batch normalization
6.2.4 Layer normalization
6.2.5 Training with layer normalization
6.2.6 Which normalization layer to use?
6.2.7 A peculiarity of layer normalization
6.3 Skip connections: A network design pattern
6.3.1 Implementing fully connected skips
6.3.2 Implementing convolutional skips
6.4 1 × 1 Convolutions: Sharing and reshaping information in channels
6.4.1 Training with 1 × 1 convolutions
6.5 Residual connections
6.5.1 Residual blocks
6.5.2 Implementing residual blocks
6.5.3 Residual bottlenecks
6.5.4 Implementing residual bottlenecks
6.6 Long short-term memory RNNs
6.6.1 RNNs: A fast review
6.6.2 LSTMs and the gating mechanism
6.6.3 Training an LSTM
Exercises
Summary
Part 2. Building advanced networks
7 Autoencoding and self-supervision
7.1 How autoencoding works
7.1.1 Principal component analysis is a bottleneck autoencoder
7.1.2 Implementing PCA
7.1.3 Implementing PCA with PyTorch
7.1.4 Visualizing PCA results
7.1.5 A simple nonlinear PCA
7.2 Designing autoencoding neural networks
7.2.1 Implementing an autoencoder
7.2.2 Visualizing autoencoder results
7.3 Bigger autoencoders
7.3.1 Robustness to noise
7.4 Denoising autoencoders
7.4.1 Denoising with Gaussian noise
7.5 Autoregressive models for time series and sequences
7.5.1 Implementing the char-RNN autoregressive text model
7.5.2 Autoregressive models are generative models
7.5.3 Changing samples with temperature
7.5.4 Faster sampling
Exercises
Summary
8 Object detection
8.1 Image segmentation
8.1.1 Nuclei detection: Loading the data
8.1.2 Representing the image segmentation problem in PyTorch
8.1.3 Building our first image segmentation network
8.2 Transposed convolutions for expanding image size
8.2.1 Implementing a network with transposed convolutions
8.3 U-Net: Looking at fine and coarse details
8.3.1 Implementing U-Net
8.4 Object detection with bounding boxes
8.4.1 Faster R-CNN
8.4.2 Using Faster R-CNN in PyTorch
8.4.3 Suppressing overlapping boxes
8.5 Using the pretrained Faster R-CNN
Exercises
Summary
9 Generative adversarial networks
9.1 Understanding generative adversarial networks
9.1.1 The loss computations
9.1.2 The GAN games
9.1.3 Implementing our first GAN
9.2 Mode collapse
9.3 Wasserstein GAN: Mitigating mode collapse
9.3.1 WGAN discriminator loss
9.3.2 WGAN generator loss
9.3.3 Implementing WGAN
9.4 Convolutional GAN
9.4.1 Designing a convolutional generator
9.4.2 Designing a convolutional discriminator
9.5 Conditional GAN
9.5.1 Implementing a conditional GAN
9.5.2 Training a conditional GAN
9.5.3 Controlling the generation with conditional GANs
9.6 Walking the latent space of GANs
9.6.1 Getting models from the Hub
9.6.2 Interpolating GAN output
9.6.3 Labeling latent dimensions
9.7 Ethics in deep learning
Exercises
Summary
10 Attention mechanisms
10.1 Attention mechanisms learn relative input importance
10.1.1 Training our baseline model
10.1.2 Attention mechanism mechanics
10.1.3 Implementing a simple attention mechanism
10.2 Adding some context
10.2.1 Dot score
10.2.2 General score
10.2.3 Additive attention
10.2.4 Computing attention weights
10.3 Putting it all together: A complete attention mechanism with context
Exercises
Summary
11 Sequence-to-sequence
11.1 Sequence-to-sequence as a kind of denoising autoencoder
11.1.1 Adding attention creates Seq2Seq
11.2 Machine translation and the data loader
11.2.1 Loading a small English-French dataset
11.3 Inputs to Seq2Seq
11.3.1 Autoregressive approach
11.3.2 Teacher-forcing approach
11.3.3 Teacher forcing vs. an autoregressive approach
11.4 Seq2Seq with attention
11.4.1 Implementing Seq2Seq
11.4.2 Training and evaluation
Exercises
Summary
12 Network design alternatives to RNNs
12.1 TorchText: Tools for text problems
12.1.1 Installing TorchText
12.1.2 Loading datasets in TorchText
12.1.3 Defining a baseline model
12.2 Averaging embeddings over time
12.2.1 Weighted average over time with attention
12.3 Pooling over time and 1D CNNs
12.4 Positional embeddings add sequence information to any model
12.4.1 Implementing a positional encoding module
12.4.2 Defining positional encoding models
12.5 Transformers: Big models for big data
12.5.1 Multiheaded attention
12.5.2 Transformer blocks
Exercises
Summary
13 Transfer learning
13.1 Transferring model parameters
13.1.1 Preparing an image dataset
13.2 Transfer learning and training with CNNs
13.2.1 Adjusting pretrained networks
13.2.2 Preprocessing for pretrained ResNet
13.2.3 Training with warm starts
13.2.4 Training with frozen weights
13.3 Learning with fewer labels
13.4 Pretraining with text
13.4.1 Transformers with the Hugging Face library
13.4.2 Freezing weights with no-grad
Exercises
Summary
14 Advanced building blocks
14.1 Problems with pooling
14.1.1 Aliasing compromises translation invariance
14.1.2 Anti-aliasing by blurring
14.1.3 Applying anti-aliased pooling
14.2 Improved residual blocks
14.2.1 Effective depth
14.2.2 Implementing ReZero
14.3 MixUp training reduces overfitting
14.3.1 Picking the mix rate
14.3.2 Implementing MixUp
Exercises
Summary
Appendix. Setting up Colab
A.1 Creating a Colab session
Adding a GPU
Testing your GPU
Index
inside back cover

Overview

Journey through the theory and practice of modern deep learning, and apply innovative techniques to solve everyday data problems.

In Inside Deep Learning, you will learn how to:

Implement deep learning with PyTorch
Select the right deep learning components
Train and evaluate a deep learning model
Fine-tune deep learning models to maximize performance
Understand deep learning terminology
Adapt existing PyTorch code to solve new problems

Inside Deep Learning is an accessible guide to implementing deep learning with the PyTorch framework. It demystifies complex deep learning concepts and teaches you to understand the vocabulary of deep learning so you can keep pace in a rapidly evolving field. No detail is skipped—you’ll dive into math, theory, and practical applications. Everything is clearly explained in plain English.

About the Technology
Deep learning doesn’t have to be a black box! Knowing how your models and algorithms actually work gives you greater control over your results. And you don’t have to be a mathematics expert or a senior data scientist to grasp what’s going on inside a deep learning system. This book gives you the practical insight you need to understand and explain your work with confidence.

About the Book
Inside Deep Learning illuminates the inner workings of deep learning algorithms in a way that even machine learning novices can understand. You’ll explore deep learning concepts and tools through plain language explanations, annotated code, and dozens of instantly useful PyTorch examples. Each type of neural network is clearly presented without complex math, and every solution in this book can run using readily available GPU hardware!

What's Inside

Select the right deep learning components
Train and evaluate a deep learning model
Fine-tune deep learning models to maximize performance
Understand deep learning terminology

About the Reader
For Python programmers with basic machine learning skills.

About the Author
Edward Raff is a Chief Scientist at Booz Allen Hamilton, and the author of the JSAT machine learning library.

Quotes
Pick up this book, and you won’t be able to put it down. A rich, engaging knowledge base of deep learning math, algorithms, and models—just like the title says!
- From the Foreword by Kirk Borne, Ph.D., Chief Science Officer, DataPrime.ai

The clearest and easiest book for learning deep learning principles and techniques I have ever read. The graphical representations for the algorithms are an eye-opening revelation.
- Richard Vaughan, Purple Monkey Collective

A great read for anyone interested in understanding the details of deep learning.
- Vishwesh Ravi Shrimali, MBRDI
