Deep Learning from Scratch

Book description

With the resurgence of neural networks in the 2010s, deep learning has become essential for machine learning practitioners and even many software engineers. This book provides a comprehensive introduction for data scientists and software engineers with machine learning experience. You’ll start with deep learning basics and move quickly to the details of important advanced architectures, implementing everything from scratch along the way.

Author Seth Weidman shows you how neural networks work using a first principles approach. You’ll learn how to apply multilayer neural networks, convolutional neural networks, and recurrent neural networks from the ground up. With a thorough understanding of how neural networks work mathematically, computationally, and conceptually, you’ll be set up for success on all future deep learning projects.

This book provides:

  • Extremely clear and thorough mental models—accompanied by working code examples and mathematical explanations—for understanding neural networks
  • Methods for implementing multilayer neural networks from scratch, using an easy-to-understand object-oriented framework (sketched briefly after this list)
  • Working implementations and clear-cut explanations of convolutional and recurrent neural networks
  • Implementation of these neural network concepts using the popular PyTorch framework
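To make the second bullet concrete, here is a minimal sketch, in plain NumPy, of the kind of object-oriented building block such a from-scratch framework rests on: an operation that pairs a forward pass with a backward pass. The class and method names here (`Operation`, `_output`, `_input_grad`) are illustrative assumptions for this page, not the book's exact API.

```python
import numpy as np

class Operation:
    """One node in a computational graph: stores its input on the
    forward pass so the backward pass can apply the chain rule."""
    def forward(self, input_: np.ndarray) -> np.ndarray:
        self.input_ = input_
        self.output = self._output()
        return self.output

    def backward(self, output_grad: np.ndarray) -> np.ndarray:
        # Gradient of the loss with respect to this operation's input.
        return self._input_grad(output_grad)

class Sigmoid(Operation):
    def _output(self) -> np.ndarray:
        return 1.0 / (1.0 + np.exp(-self.input_))

    def _input_grad(self, output_grad: np.ndarray) -> np.ndarray:
        # d(sigmoid)/dx = sigmoid * (1 - sigmoid)
        return self.output * (1.0 - self.output) * output_grad

x = np.array([[0.5, -1.0]])
op = Sigmoid()
out = op.forward(x)
grad = op.backward(np.ones_like(out))  # gradient of a dummy loss that sums out
```

For comparison with the last bullet, the same forward/backward idea in PyTorch, where autograd supplies the backward pass automatically; again a sketch, not code from the book:

```python
import torch

x = torch.tensor([[0.5, -1.0]], requires_grad=True)
out = torch.sigmoid(x)
out.sum().backward()  # autograd fills x.grad with the same gradient as above
print(x.grad)
```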

Table of contents

  1. Preface
    1. Understanding Neural Networks Requires Multiple Mental Models
    2. Chapter Outlines
    3. Conventions Used in This Book
    4. Using Code Examples
    5. O’Reilly Online Learning
    6. How to Contact Us
    7. Acknowledgments
  2. 1. Foundations
    1. Functions
      1. Math
      2. Diagrams
      3. Code
    2. Derivatives
      1. Math
      2. Diagrams
      3. Code
    3. Nested Functions
      1. Diagram
      2. Math
      3. Code
      4. Another Diagram
    4. The Chain Rule
      1. Math
      2. Code
    5. A Slightly Longer Example
      1. Math
      2. Diagram
      3. Code
    6. Functions with Multiple Inputs
      1. Math
      2. Diagram
      3. Code
    7. Derivatives of Functions with Multiple Inputs
      1. Diagram
      2. Math
      3. Code
    8. Functions with Multiple Vector Inputs
      1. Math
    9. Creating New Features from Existing Features
      1. Math
      2. Diagram
      3. Code
    10. Derivatives of Functions with Multiple Vector Inputs
      1. Diagram
      2. Math
      3. Code
    11. Vector Functions and Their Derivatives: One Step Further
      1. Diagram
      2. Math
      3. Code
      4. Vector Functions and Their Derivatives: The Backward Pass
    12. Computational Graph with Two 2D Matrix Inputs
      1. Math
      2. Diagram
      3. Code
    13. The Fun Part: The Backward Pass
      1. Diagram
      2. Math
      3. Code
    14. Conclusion
  3. 2. Fundamentals
    1. Supervised Learning Overview
    2. Supervised Learning Models
    3. Linear Regression
      1. Linear Regression: A Diagram
      2. Linear Regression: A More Helpful Diagram (and the Math)
      3. Adding in the Intercept
      4. Linear Regression: The Code
    4. Training the Model
      1. Calculating the Gradients: A Diagram
      2. Calculating the Gradients: The Math (and Some Code)
      3. Calculating the Gradients: The (Full) Code
      4. Using These Gradients to Train the Model
    5. Assessing Our Model: Training Set Versus Testing Set
    6. Assessing Our Model: The Code
      1. Analyzing the Most Important Feature
    7. Neural Networks from Scratch
      1. Step 1: A Bunch of Linear Regressions
      2. Step 2: A Nonlinear Function
      3. Step 3: Another Linear Regression
      4. Diagrams
      5. Code
      6. Neural Networks: The Backward Pass
    8. Training and Assessing Our First Neural Network
      1. Two Reasons Why This Is Happening
    9. Conclusion
  4. 3. Deep Learning from Scratch
    1. Deep Learning Definition: A First Pass
    2. The Building Blocks of Neural Networks: Operations
      1. Diagram
      2. Code
    3. The Building Blocks of Neural Networks: Layers
      1. Diagrams
    4. Building Blocks on Building Blocks
      1. The Layer Blueprint
      2. The Dense Layer
    5. The NeuralNetwork Class, and Maybe Others
      1. Diagram
      2. Code
      3. Loss Class
    6. Deep Learning from Scratch
      1. Implementing Batch Training
      2. NeuralNetwork: Code
    7. Trainer and Optimizer
      1. Optimizer
      2. Trainer
    8. Putting Everything Together
      1. Our First Deep Learning Model (from Scratch)
    9. Conclusion and Next Steps
  5. 4. Extensions
    1. Some Intuition About Neural Networks
    2. The Softmax Cross Entropy Loss Function
      1. Component #1: The Softmax Function
      2. Component #2: The Cross Entropy Loss
      3. A Note on Activation Functions
    3. Experiments
      1. Data Preprocessing
      2. Model
      3. Experiment: Softmax Cross Entropy Loss
    4. Momentum
      1. Intuition for Momentum
      2. Implementing Momentum in the Optimizer Class
      3. Experiment: Stochastic Gradient Descent with Momentum
    5. Learning Rate Decay
      1. Types of Learning Rate Decay
      2. Experiments: Learning Rate Decay
    6. Weight Initialization
      1. Math and Code
      2. Experiments: Weight Initialization
    7. Dropout
      1. Definition
      2. Implementation
      3. Experiments: Dropout
    8. Conclusion
  6. 5. Convolutional Neural Networks
    1. Neural Networks and Representation Learning
      1. A Different Architecture for Image Data
      2. The Convolution Operation
      3. The Multichannel Convolution Operation
    2. Convolutional Layers
      1. Implementation Implications
      2. The Differences Between Convolutional and Fully Connected Layers
      3. Making Predictions with Convolutional Layers: The Flatten Layer
      4. Pooling Layers
    3. Implementing the Multichannel Convolution Operation
      1. The Forward Pass
      2. Convolutions: The Backward Pass
      3. Batches, 2D Convolutions, and Multiple Channels
      4. 2D Convolutions
      5. The Last Element: Adding “Channels”
    4. Using This Operation to Train a CNN
      1. The Flatten Operation
      2. The Full Conv2D Layer
      3. Experiments
    5. Conclusion
  7. 6. Recurrent Neural Networks
    1. The Key Limitation: Handling Branching
    2. Automatic Differentiation
      1. Coding Up Gradient Accumulation
    3. Motivation for Recurrent Neural Networks
    4. Introduction to Recurrent Neural Networks
      1. The First Class for RNNs: RNNLayer
      2. The Second Class for RNNs: RNNNode
      3. Putting These Two Classes Together
      4. The Backward Pass
    5. RNNs: The Code
      1. The RNNLayer Class
      2. The Essential Elements of RNNNodes
      3. “Vanilla” RNNNodes
      4. Limitations of “Vanilla” RNNNodes
      5. One Solution: GRUNodes
      6. LSTMNodes
      7. Data Representation for a Character-Level RNN-Based Language Model
      8. Other Language Modeling Tasks
      9. Combining RNNLayer Variants
      10. Putting This All Together
    6. Conclusion
  8. 7. PyTorch
    1. PyTorch Tensors
    2. Deep Learning with PyTorch
      1. PyTorch Elements: Model, Layer, Optimizer, and Loss
      2. Implementing Neural Network Building Blocks Using PyTorch: DenseLayer
      3. Example: Boston Housing Prices Model in PyTorch
      4. PyTorch Elements: Optimizer and Loss
      5. PyTorch Elements: Trainer
      6. Tricks to Optimize Learning in PyTorch
    3. Convolutional Neural Networks in PyTorch
      1. DataLoader and Transforms
      2. LSTMs in PyTorch
    4. Postscript: Unsupervised Learning via Autoencoders
      1. Representation Learning
      2. An Approach for Situations with No Labels Whatsoever
      3. Implementing an Autoencoder in PyTorch
      4. A Stronger Test for Unsupervised Learning, and a Solution
    5. Conclusion
  9. A. Deep Dives
    1. Matrix Chain Rule
    2. Gradient of the Loss with Respect to the Bias Terms
    3. Convolutions via Matrix Multiplication
  10. Index

Product information

  • Title: Deep Learning from Scratch
  • Author(s): Seth Weidman
  • Release date: September 2019
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492041412