Probabilistic Deep Learning

Name: Probabilistic Deep Learning
ISBN: 9781617296079

by Beate Sick, Oliver Duerr, Elvis Murina

November 2020

Intermediate to advanced

296 pages

9h 8m

English

Manning Publications

Probabilistic Deep Learning
Copyright
brief contents
contents
front matter
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A roadmap
About the code
liveBook discussion forum
about the authors
about the cover illustration
Part 1. Basics of deep learning
1 Introduction to probabilistic deep learning
1.1 A first look at probabilistic models
1.2 A first brief look at deep learning (DL)
1.2.1 A success story
1.3 Classification
1.3.1 Traditional approach to image classification
1.3.2 Deep learning approach to image classification
1.3.3 Non-probabilistic classification
1.3.4 Probabilistic classification
1.3.5 Bayesian probabilistic classification
1.4 Curve fitting
1.4.1 Non-probabilistic curve fitting
1.4.2 Probabilistic curve fitting
1.4.3 Bayesian probabilistic curve fitting
1.5 When to use and when not to use DL?
1.5.1 When not to use DL
1.5.2 When to use DL
1.5.3 When to use and when not to use probabilistic models?
1.6 What you’ll learn in this book
Summary
2 Neural network architectures
2.1 Fully connected neural networks (fcNNs)
2.1.1 The biology that inspired the design of artificial NNs
2.1.2 Getting started with implementing an NN
2.1.3 Using a fully connected NN (fcNN) to classify images
2.2 Convolutional NNs for image-like data
2.2.1 Main ideas in a CNN architecture
2.2.2 A minimal CNN for edge lovers
2.2.3 Biological inspiration for a CNN architecture
2.2.4 Building and understanding a CNN
2.3 One-dimensional CNNs for ordered data
2.3.1 Format of time-ordered data
2.3.2 What’s special about ordered data?
2.3.3 Architectures for time-ordered data
Summary
3 Principles of curve fitting
3.1 “Hello world” in curve fitting
3.1.1 Fitting a linear regression model based on a loss function
3.2 Gradient descent method
3.2.1 Loss with one free model parameter
3.2.2 Loss with two free model parameters
3.3 Special DL sauce
3.3.1 Mini-batch gradient descent
3.3.2 Using SGD variants to speed up the learning
3.3.3 Automatic differentiation
3.4 Backpropagation in DL frameworks
3.4.1 Static graph frameworks
3.4.2 Dynamic graph frameworks
Summary
Part 2. Maximum likelihood approaches for probabilistic DL models

4 Building loss functions with the likelihood approach
4.1 Introduction to the MaxLike principle: The mother of all loss functions
4.2 Deriving a loss function for a classification problem
4.2.1 Binary classification problem
4.2.2 Classification problems with more than two classes
4.2.3 Relationship between NLL, cross entropy, and Kullback-Leibler divergence
4.3 Deriving a loss function for regression problems
4.3.1 Using a NN without hidden layers and one output neuron for modeling a linear relationship between input and output
4.3.2 Using a NN with hidden layers to model non-linear relationships between input and output
4.3.3 Using an NN with additional output for regression tasks with nonconstant variance
Summary
5 Probabilistic deep learning models with TensorFlow Probability
5.1 Evaluating and comparing different probabilistic prediction models
5.2 Introducing TensorFlow Probability (TFP)
5.3 Modeling continuous data with TFP
5.3.1 Fitting and evaluating a linear regression model with constant variance
5.3.2 Fitting and evaluating a linear regression model with a nonconstant standard deviation
5.4 Modeling count data with TensorFlow Probability
5.4.1 The Poisson distribution for count data
5.4.2 Extending the Poisson distribution to a zero-inflated Poisson (ZIP) distribution
Summary
6 Probabilistic deep learning models in the wild
6.1 Flexible probability distributions in state-of-the-art DL models
6.1.1 Multinomial distribution as a flexible distribution
6.1.2 Making sense of discretized logistic mixture
6.2 Case study: Bavarian roadkills
6.3 Go with the flow: Introduction to normalizing flows (NFs)
6.3.1 The principle idea of NFs
6.3.2 The change of variable technique for probabilities
6.3.3 Fitting an NF to data
6.3.4 Going deeper by chaining flows
6.3.5 Transformation between higher dimensional spaces*
6.3.6 Using networks to control flows
6.3.7 Fun with flows: Sampling faces
Summary
Part 3. Bayesian approaches for probabilistic DL models
7 Bayesian learning
7.1 What’s wrong with non-Bayesian DL: The elephant in the room
7.2 The first encounter with a Bayesian approach
7.2.1 Bayesian model: The hacker’s way
7.2.2 What did we just do?
7.3 The Bayesian approach for probabilistic models
7.3.1 Training and prediction with a Bayesian model
7.3.2 A coin toss as a Hello World example for Bayesian models
7.3.3 Revisiting the Bayesian linear regression model
Summary
8 Bayesian neural networks
8.1 Bayesian neural networks (BNNs)
8.2 Variational inference (VI) as an approximative Bayes approach
8.2.1 Looking under the hood of VI*
8.2.2 Applying VI to the toy problem*
8.3 Variational inference with TensorFlow Probability
8.4 MC dropout as an approximate Bayes approach
8.4.1 Classical dropout used during training
8.4.2 MC dropout used during train and test times
8.5 Case studies
8.5.1 Regression case study on extrapolation
8.5.2 Classification case study with novel classes
Summary
Glossary of terms and abbreviations
index

Content preview from Probabilistic Deep Learning

4 Building loss functions with the likelihood approach

This chapter covers

Using the maximum likelihood approach for estimating model parameters
Determining a loss function for classification problems
Determining a loss function for regression problems

In the last chapter, you saw how to determine parameter values by optimizing a loss function with stochastic gradient descent (SGD). This approach also works for DL models that have millions of parameters. But how did we arrive at the loss function? In the linear regression problem (see sections 1.4 and 3.1), we used the mean squared error (MSE) as the loss function. We don’t claim ...
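
The setup recalled in this paragraph, a linear regression fitted by minimizing the MSE with mini-batch SGD, can be sketched in a few lines of NumPy. This is only an illustrative sketch and not the book's code: the synthetic data, the function names `mse_loss` and `fit_line_sgd`, and the learning rate, batch size, and epoch count are all assumptions chosen for the example.

```python
import numpy as np

def mse_loss(a, b, x, y):
    """Mean squared error of the line y_hat = a * x + b."""
    return np.mean((a * x + b - y) ** 2)

def fit_line_sgd(x, y, lr=0.05, epochs=200, batch_size=8, seed=0):
    """Fit slope a and intercept b by mini-batch SGD on the MSE.

    Hyperparameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the data each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            residual = a * x[idx] + b - y[idx]
            # Gradients of the mini-batch MSE with respect to a and b
            grad_a = 2.0 * np.mean(residual * x[idx])
            grad_b = 2.0 * np.mean(residual)
            a -= lr * grad_a
            b -= lr * grad_b
    return a, b

# Synthetic data (assumed for illustration): y = 2x + 1 plus small Gaussian noise
data_rng = np.random.default_rng(42)
x = data_rng.uniform(-1.0, 1.0, size=64)
y = 2.0 * x + 1.0 + data_rng.normal(0.0, 0.1, size=64)

a_hat, b_hat = fit_line_sgd(x, y)
print(f"slope={a_hat:.2f}, intercept={b_hat:.2f}, "
      f"MSE={mse_loss(a_hat, b_hat, x, y):.4f}")
```

The fitted slope and intercept should land close to the values used to generate the data, and the final MSE should be far below the loss at the starting point `a = b = 0`.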
