Grokking Deep Learning

About the Technology

Deep learning, a branch of artificial intelligence, teaches computers to learn by using neural networks, technology inspired by the human brain. Online text translation, self-driving cars, personalized product recommendations, and virtual voice assistants are just a few of the exciting modern advancements possible thanks to deep learning.



About the Book

Grokking Deep Learning teaches you to build deep learning neural networks from scratch! In his engaging style, seasoned deep learning expert Andrew Trask shows you the science under the hood, so you grok for yourself every detail of training neural networks. Using only Python and its math-supporting library, NumPy, you’ll train your own neural networks to see and understand images, translate text into different languages, and even write like Shakespeare! When you’re done, you’ll be fully prepared to move on to mastering deep learning frameworks.
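
To give a concrete feel for the book's from-scratch approach, here is a minimal sketch (illustrative, not taken from the book) of a one-weight network making a prediction and taking a single gradient-descent step in plain Python; the input, goal, weight, and alpha values are assumed for the example.

    # Illustrative sketch of the from-scratch style the book teaches:
    # one input, one weight, a prediction, and a gradient-descent update.
    input_value, goal, weight, alpha = 0.5, 0.8, 0.1, 0.1  # assumed example values

    for iteration in range(20):
        prediction = input_value * weight    # forward propagation
        delta = prediction - goal            # how far off the prediction is
        error = delta ** 2                   # squared error
        weight_delta = delta * input_value   # direction and amount of the update
        weight -= alpha * weight_delta       # one gradient-descent step
        print(error, prediction)

The names weight_delta and alpha mirror the chapter 4 sections listed in the table of contents below.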



What's Inside

  • The science behind deep learning
  • Building and training your own neural networks
  • Privacy concepts, including federated learning
  • Tips for continuing your pursuit of deep learning


About the Reader

For readers with high school-level math and intermediate programming skills.



About the Author

Andrew Trask is a PhD student at Oxford University and a research scientist at DeepMind. Previously, Andrew was a researcher and analytics product manager at Digital Reasoning, where he trained the world’s largest artificial neural network and helped guide the analytics roadmap for the Synthesys cognitive computing platform.




Quotes

An excellent introduction and overview of deep learning by a masterful teacher who guides, illuminates, and encourages you along the way.
- Kelvin D. Meeks, International Technology Ventures

All concepts are clearly explained with excellent visualizations and examples.
- Kalyan Reddy, ArisGlobal

Excels at navigating the reader through the introductory details of deep learning in a simple and intuitive way.
- Eremey Valetov, Lancaster University/Fermilab

Your step-by-step guide to learning AI.
- Ian Stirk, I and K Consulting

A complex topic simplified!
- Vipul Gupta, Microsoft

Table of contents

  1. Copyright
    1. Dedication
  2. Brief Table of Contents
  3. Table of Contents
  4. Preface
  5. Acknowledgments
  6. About this book
    1. Who should read this book
    2. Roadmap
    3. About the code: Conventions and downloads
    4. Book forum
  7. About the author
  8. Chapter 1. Introducing deep learning: why you should learn it
    1. Welcome to Grokking Deep Learning
    2. Why you should learn deep learning
    3. Will this be difficult to learn?
    4. Why you should read this book
    5. What you need to get started
    6. You’ll probably need some Python knowledge
    7. Summary
  9. Chapter 2. Fundamental concepts: how do machines learn?
    1. What is deep learning?
    2. What is machine learning?
    3. Supervised machine learning
    4. Unsupervised machine learning
    5. Parametric vs. nonparametric learning
    6. Supervised parametric learning
    7. Unsupervised parametric learning
    8. Nonparametric learning
    9. Summary
  10. Chapter 3. Introduction to neural prediction: forward propagation
    1. Step 1: Predict
    2. A simple neural network making a prediction
    3. What is a neural network?
    4. What does this neural network do?
    5. Making a prediction with multiple inputs
    6. Multiple inputs: What does this neural network do?
    7. Multiple inputs: Complete runnable code
    8. Making a prediction with multiple outputs
    9. Predicting with multiple inputs and outputs
    10. Multiple inputs and outputs: How does it work?
    11. Predicting on predictions
    12. A quick primer on NumPy
    13. Summary
  11. Chapter 4. Introduction to neural learning: gradient descent
    1. Predict, compare, and learn
    2. Compare
    3. Learn
    4. Compare: Does your network make good predictions?
    5. Why measure error?
    6. What’s the simplest form of neural learning?
    7. Hot and cold learning
    8. Characteristics of hot and cold learning
    9. Calculating both direction and amount from error
    10. One iteration of gradient descent
    11. Learning is just reducing error
    12. Let’s watch several steps of learning
    13. Why does this work? What is weight_delta, really?
    14. Tunnel vision on one concept
    15. A box with rods poking out of it
    16. Derivatives: Take two
    17. What you really need to know
    18. What you don’t really need to know
    19. How to use a derivative to learn
    20. Look familiar?
    21. Breaking gradient descent
    22. Visualizing the overcorrections
    23. Divergence
    24. Introducing alpha
    25. Alpha in code
    26. Memorizing
  12. Chapter 5. Learning multiple weights at a time: generalizing gradient descent
    1. Gradient descent learning with multiple inputs
    2. Gradient descent with multiple inputs explained
    3. Let’s watch several steps of learning
    4. Freezing one weight: What does it do?
    5. Gradient descent learning with multiple outputs
    6. Gradient descent with multiple inputs and outputs
    7. What do these weights learn?
    8. Visualizing weight values
    9. Visualizing dot products (weighted sums)
    10. Summary
  13. Chapter 6. Building your first deep neural network: introduction to backpropagation
    1. The streetlight problem
    2. Preparing the data
    3. Matrices and the matrix relationship
    4. Creating a matrix or two in Python
    5. Building a neural network
    6. Learning the whole dataset
    7. Full, batch, and stochastic gradient descent
    8. Neural networks learn correlation
    9. Up and down pressure
    10. Edge case: Overfitting
    11. Edge case: Conflicting pressure
    12. Learning indirect correlation
    13. Creating correlation
    14. Stacking neural networks: A review
    15. Backpropagation: Long-distance error attribution
    16. Backpropagation: Why does this work?
    17. Linear vs. nonlinear
    18. Why the neural network still doesn’t work
    19. The secret to sometimes correlation
    20. A quick break
    21. Your first deep neural network
    22. Backpropagation in code
    23. One iteration of backpropagation
    24. Putting it all together
    25. Why do deep networks matter?
  14. Chapter 7. How to picture neural networks: in your head and on paper
    1. It’s time to simplify
    2. Correlation summarization
    3. The previously overcomplicated visualization
    4. The simplified visualization
    5. Simplifying even further
    6. Let’s see this network predict
    7. Visualizing using letters instead of pictures
    8. Linking the variables
    9. Everything side by side
    10. The importance of visualization tools
  15. Chapter 8. Learning signal and ignoring noise: introduction to regularization and batching
    1. Three-layer network on MNIST
    2. Well, that was easy
    3. Memorization vs. generalization
    4. Overfitting in neural networks
    5. Where overfitting comes from
    6. The simplest regularization: Early stopping
    7. Industry standard regularization: Dropout
    8. Why dropout works: Ensembling works
    9. Dropout in code
    10. Dropout evaluated on MNIST
    11. Batch gradient descent
    12. Summary
  16. Chapter 9. Modeling probabilities and nonlinearities: activation functions
    1. What is an activation function?
    2. Standard hidden-layer activation functions
    3. Standard output layer activation functions
    4. The core issue: Inputs have similarity
    5. softmax computation
    6. Activation installation instructions
    7. Multiplying delta by the slope
    8. Converting output to slope (derivative)
    9. Upgrading the MNIST network
  17. Chapter 10. Neural learning about edges and corners: intro to convolutional neural networks
    1. Reusing weights in multiple places
    2. The convolutional layer
    3. A simple implementation in NumPy
    4. Summary
  18. Chapter 11. Neural networks that understand language: king – man + woman == ?
    1. What does it mean to understand language?
    2. Natural language processing (NLP)
    3. Supervised NLP
    4. IMDB movie reviews dataset
    5. Capturing word correlation in input data
    6. Predicting movie reviews
    7. Intro to an embedding layer
    8. Interpreting the output
    9. Neural architecture
    10. Comparing word embeddings
    11. What is the meaning of a neuron?
    12. Filling in the blank
    13. Meaning is derived from loss
    14. King – Man + Woman ~= Queen
    15. Word analogies
    16. Summary
  19. Chapter 12. Neural networks that write like Shakespeare: recurrent layers for variable-length data
    1. The challenge of arbitrary length
    2. Do comparisons really matter?
    3. The surprising power of averaged word vectors
    4. How is information stored in these embeddings?
    5. How does a neural network use embeddings?
    6. The limitations of bag-of-words vectors
    7. Using identity vectors to sum word embeddings
    8. Matrices that change absolutely nothing
    9. Learning the transition matrices
    10. Learning to create useful sentence vectors
    11. Forward propagation in Python
    12. How do you backpropagate into this?
    13. Let’s train it!
    14. Setting things up
    15. Forward propagation with arbitrary length
    16. Backpropagation with arbitrary length
    17. Weight update with arbitrary length
    18. Execution and output analysis
    19. Summary
  20. Chapter 13. Introducing automatic optimization: let’s build a deep learning framework
    1. What is a deep learning framework?
    2. Introduction to tensors
    3. Introduction to automatic gradient computation (autograd)
    4. A quick checkpoint
    5. Tensors that are used multiple times
    6. Upgrading autograd to support multiuse tensors
    7. How does addition backpropagation work?
    8. Adding support for negation
    9. Adding support for additional functions
    10. Using autograd to train a neural network
    11. Adding automatic optimization
    12. Adding support for layer types
    13. Layers that contain layers
    14. Loss-function layers
    15. How to learn a framework
    16. Nonlinearity layers
    17. The embedding layer
    18. Adding indexing to autograd
    19. The embedding layer (revisited)
    20. The cross-entropy layer
    21. The recurrent neural network layer
    22. Summary
  21. Chapter 14. Learning to write like Shakespeare: long short-term memory
    1. Character language modeling
    2. The need for truncated backpropagation
    3. Truncated backpropagation
    4. A sample of the output
    5. Vanishing and exploding gradients
    6. A toy example of RNN backpropagation
    7. Long short-term memory (LSTM) cells
    8. Some intuition about LSTM gates
    9. The long short-term memory layer
    10. Upgrading the character language model
    11. Training the LSTM character language model
    12. Tuning the LSTM character language model
    13. Summary
  22. Chapter 15. Deep learning on unseen data: introducing federated learning
    1. The problem of privacy in deep learning
    2. Federated learning
    3. Learning to detect spam
    4. Let’s make it federated
    5. Hacking into federated learning
    6. Secure aggregation
    7. Homomorphic encryption
    8. Homomorphically encrypted federated learning
    9. Summary
  23. Chapter 16. Where to go from here: a brief guide
    1. Congratulations!
    2. Step 1: Start learning PyTorch
    3. Step 2: Start another deep learning course
    4. Step 3: Grab a mathy deep learning textbook
    5. Step 4: Start a blog, and teach deep learning
    6. Step 5: Twitter
    7. Step 6: Implement academic papers
    8. Step 7: Acquire access to a GPU (or many)
    9. Step 8: Get paid to practice
    10. Step 9: Join an open source project
    11. Step 10: Develop your local community
  24. Index

Product information

  • Title: Grokking Deep Learning
  • Author(s): Andrew W. Trask
  • Release date: February 2019
  • Publisher(s): Manning Publications
  • ISBN: 9781617293702