Chapter 5. Convolutional Neural Networks
In this chapter, we’ll cover convolutional neural networks (CNNs). CNNs are the standard neural network architecture for prediction when the input observations are images, which is the case in a wide range of neural network applications. So far in the book, we’ve focused exclusively on fully connected neural networks, which we implemented as a series of Dense layers. Thus, we’ll start this chapter by reviewing some key elements of those networks to motivate why we might want a different architecture for images. We’ll then cover CNNs the same way we’ve introduced other concepts in this book: we’ll first discuss how they work at a high level, then move to discussing them at a lower level, and finally show in detail how they work by coding up the convolution operation from scratch. By the end of this chapter, you’ll have a thorough enough understanding of how CNNs work to use them both to solve problems and to learn about advanced CNN variants, such as ResNets, DenseNets, and Octave Convolutions, on your own.
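As a small preview of that from-scratch implementation, here is a minimal sketch of a single-channel 2D convolution in NumPy. The function name, the "valid" (no padding) behavior, and the toy inputs are illustrative assumptions, not the exact code developed later in the chapter; as is conventional in deep learning, the operation implemented is strictly a cross-correlation.

import numpy as np

def conv_2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    # Slide the kernel over the image and sum the elementwise products
    # at each location ("valid" convolution: no padding, output shrinks).
    img_h, img_w = image.shape
    k_h, k_w = kernel.shape
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            output[i, j] = np.sum(image[i:i + k_h, j:j + k_w] * kernel)
    return output

# A 5x5 "image" convolved with a 3x3 vertical-edge kernel gives a 3x3 output
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])
print(conv_2d(image, kernel).shape)  # (3, 3)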
Neural Networks and Representation Learning
Neural networks initially receive data on observations, with each observation represented by some number of features, n. So far we’ve seen two examples of this in two very different domains: the first was the house prices dataset, where each observation was made up of 13 features, each of which represented a numeric characteristic about that ...
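To make that representation concrete, here is a minimal sketch, assuming NumPy and an arbitrary batch size, of the [observations, features] array that the fully connected networks from earlier chapters consume; the variable names and values are illustrative only.

import numpy as np

# One row per observation, one column per feature: 13 numeric features per
# house, mirroring the house prices example (batch size chosen arbitrarily).
num_observations = 4
X = np.random.rand(num_observations, 13)
print(X.shape)  # (4, 13): each observation is just a flat vector of features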