Deep Learning from Scratch

Name: Deep Learning from Scratch
Author: Seth Weidman
ISBN: 9781492041412

September 2019

Intermediate to advanced

250 pages

6h 58m

English

O'Reilly Media, Inc.

Preface
Understanding Neural Networks Requires Multiple Mental Models; Chapter Outlines; Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
1. Foundations
Functions; Math; Diagrams; Code; Derivatives; Math; Diagrams; Code; Nested Functions; Diagram; Math; Code; Another Diagram; The Chain Rule; Math; Code; A Slightly Longer Example; Math; Diagram; Code; Functions with Multiple Inputs; Math; Diagram; Code; Derivatives of Functions with Multiple Inputs; Diagram; Math; Code; Functions with Multiple Vector Inputs; Math; Creating New Features from Existing Features; Math; Diagram; Code; Derivatives of Functions with Multiple Vector Inputs; Diagram; Math; Code; Vector Functions and Their Derivatives: One Step Further; Diagram; Math; Code; Vector Functions and Their Derivatives: The Backward Pass; Computational Graph with Two 2D Matrix Inputs; Math; Diagram; Code; The Fun Part: The Backward Pass; Diagram; Math; Code; Conclusion
2. Fundamentals
Supervised Learning Overview; Supervised Learning Models; Linear Regression; Linear Regression: A Diagram; Linear Regression: A More Helpful Diagram (and the Math); Adding in the Intercept; Linear Regression: The Code; Training the Model; Calculating the Gradients: A Diagram; Calculating the Gradients: The Math (and Some Code); Calculating the Gradients: The (Full) Code; Using These Gradients to Train the Model; Assessing Our Model: Training Set Versus Testing Set; Assessing Our Model: The Code; Analyzing the Most Important Feature; Neural Networks from Scratch; Step 1: A Bunch of Linear Regressions; Step 2: A Nonlinear Function; Step 3: Another Linear Regression; Diagrams; Code; Neural Networks: The Backward Pass; Training and Assessing Our First Neural Network; Two Reasons Why This Is Happening; Conclusion
3. Deep Learning from Scratch
Deep Learning Definition: A First Pass; The Building Blocks of Neural Networks: Operations; Diagram; Code; The Building Blocks of Neural Networks: Layers; Diagrams; Building Blocks on Building Blocks; The Layer Blueprint; The Dense Layer; The NeuralNetwork Class, and Maybe Others; Diagram; Code; Loss Class; Deep Learning from Scratch; Implementing Batch Training; NeuralNetwork: Code; Trainer and Optimizer; Optimizer; Trainer; Putting Everything Together; Our First Deep Learning Model (from Scratch); Conclusion and Next Steps
4. Extensions
Some Intuition About Neural Networks; The Softmax Cross Entropy Loss Function; Component #1: The Softmax Function; Component #2: The Cross Entropy Loss; A Note on Activation Functions; Experiments; Data Preprocessing; Model; Experiment: Softmax Cross Entropy Loss; Momentum; Intuition for Momentum; Implementing Momentum in the Optimizer Class; Experiment: Stochastic Gradient Descent with Momentum; Learning Rate Decay; Types of Learning Rate Decay; Experiments: Learning Rate Decay; Weight Initialization; Math and Code; Experiments: Weight Initialization; Dropout; Definition; Implementation; Experiments: Dropout; Conclusion
5. Convolutional Neural Networks
Neural Networks and Representation Learning; A Different Architecture for Image Data; The Convolution Operation; The Multichannel Convolution Operation; Convolutional Layers; Implementation Implications; The Differences Between Convolutional and Fully Connected Layers; Making Predictions with Convolutional Layers: The Flatten Layer; Pooling Layers; Implementing the Multichannel Convolution Operation; The Forward Pass; Convolutions: The Backward Pass; Batches, 2D Convolutions, and Multiple Channels; 2D Convolutions; The Last Element: Adding “Channels”; Using This Operation to Train a CNN; The Flatten Operation; The Full Conv2D Layer; Experiments; Conclusion
6. Recurrent Neural Networks
The Key Limitation: Handling Branching; Automatic Differentiation; Coding Up Gradient Accumulation; Motivation for Recurrent Neural Networks; Introduction to Recurrent Neural Networks; The First Class for RNNs: RNNLayer; The Second Class for RNNs: RNNNode; Putting These Two Classes Together; The Backward Pass; RNNs: The Code; The RNNLayer Class; The Essential Elements of RNNNodes; “Vanilla” RNNNodes; Limitations of “Vanilla” RNNNodes; One Solution: GRUNodes; LSTMNodes; Data Representation for a Character-Level RNN-Based Language Model; Other Language Modeling Tasks; Combining RNNLayer Variants; Putting This All Together; Conclusion
7. PyTorch
PyTorch Tensors; Deep Learning with PyTorch; PyTorch Elements: Model, Layer, Optimizer, and Loss; Implementing Neural Network Building Blocks Using PyTorch: DenseLayer; Example: Boston Housing Prices Model in PyTorch; PyTorch Elements: Optimizer and Loss; PyTorch Elements: Trainer; Tricks to Optimize Learning in PyTorch; Convolutional Neural Networks in PyTorch; DataLoader and Transforms; LSTMs in PyTorch; Postscript: Unsupervised Learning via Autoencoders; Representation Learning; An Approach for Situations with No Labels Whatsoever; Implementing an Autoencoder in PyTorch; A Stronger Test for Unsupervised Learning, and a Solution; Conclusion
A. Deep Dives
Matrix Chain Rule; Gradient of the Loss with Respect to the Bias Terms; Convolutions via Matrix Multiplication
Index

Content preview from Deep Learning from Scratch

Chapter 3. Deep Learning from Scratch

You may not realize it, but you now have all the mathematical and conceptual foundations to answer the key questions about deep learning models that I posed at the beginning of the book: you understand how neural networks work—the computations involved with the matrix multiplications, the loss, and the partial derivatives with respect to that loss—as well as why those computations work (namely, the chain rule from calculus). We achieved this understanding by building neural networks from first principles, representing them as a series of “building blocks” where each building block was a single mathematical function. In this chapter, you’ll learn to represent these building blocks themselves as abstract Python classes and then use these classes to build deep learning models; by the end of this chapter, you will indeed have done “deep learning from scratch”!
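To make the idea of "building blocks as abstract Python classes" concrete, here is a minimal sketch, not the book's actual code. The names `Operation`, `Square`, `Sigmoid`, `chain_forward`, and `chain_backward` are hypothetical, and the example works on plain floats rather than matrices so that it needs only the standard library. Each block computes one mathematical function on the forward pass and applies the chain rule on the backward pass.

```python
import math
from abc import ABC, abstractmethod


class Operation(ABC):
    """Hypothetical base class: one building block is one mathematical function."""

    def forward(self, x: float) -> float:
        # Cache the input so backward() can evaluate the local derivative at it.
        self.input_ = x
        self.output_ = self._output(x)
        return self.output_

    def backward(self, output_grad: float) -> float:
        # Chain rule: upstream gradient times this block's local derivative.
        return output_grad * self._local_grad(self.input_)

    @abstractmethod
    def _output(self, x: float) -> float:
        ...

    @abstractmethod
    def _local_grad(self, x: float) -> float:
        ...


class Square(Operation):
    def _output(self, x: float) -> float:
        return x * x

    def _local_grad(self, x: float) -> float:
        return 2.0 * x


class Sigmoid(Operation):
    def _output(self, x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))

    def _local_grad(self, x: float) -> float:
        s = 1.0 / (1.0 + math.exp(-x))
        return s * (1.0 - s)


def chain_forward(ops, x: float) -> float:
    # Feed the output of each block into the next one.
    for op in ops:
        x = op.forward(x)
    return x


def chain_backward(ops, grad: float = 1.0) -> float:
    # Walk the blocks in reverse, multiplying local derivatives together.
    for op in reversed(ops):
        grad = op.backward(grad)
    return grad


if __name__ == "__main__":
    ops = [Square(), Sigmoid()]
    print("output:", chain_forward(ops, 0.5))
    print("d(output)/d(input):", chain_backward(ops))
```

Calling `chain_forward` before `chain_backward` matters here, because each block caches its input during the forward pass and reuses it to compute its derivative. A real implementation would operate on arrays and check shapes, which this sketch deliberately leaves out.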

We’ll also map the descriptions of neural networks in terms of these building blocks to more conventional descriptions of deep learning models that you may have heard before. For example, by the end of this chapter, you’ll know what it means for a deep learning model to have “multiple hidden layers.” This is really the essence of understanding a concept: being able to translate between high-level descriptions and low-level details of what is actually going on. Let’s begin building toward this translation. So far, we’ve described models just in terms of the operations that happen at a low ...


Publisher Resources

ISBN: 9781492041405
