Chapter 1. Foundations
Don’t memorize these formulas. If you understand the concepts, you can invent your own notation.
John Cochrane, Investments Notes 2006
The aim of this chapter is to explain some foundational mental models that are essential for understanding how neural networks work. Specifically, we’ll cover nested mathematical functions and their derivatives. Working our way up from the simplest possible building blocks, we’ll show that we can build complicated functions made up of a “chain” of constituent functions, and that we can compute the derivatives of those functions’ outputs with respect to their inputs even when one of the functions in the chain is a matrix multiplication that takes in multiple inputs. Understanding how this process works will be essential to understanding neural networks, which we technically won’t begin to cover until Chapter 2.
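Before the formal treatment, here is a minimal sketch of what such a “chain” looks like in code. This is not code from the book: the `chain` and `deriv` helpers are illustrative names of our own, and the example assumes NumPy. It composes two simple functions and checks, via finite differences, that multiplying the constituent derivatives together (the chain rule) matches differentiating the composite function directly:

```python
import numpy as np
from typing import Callable

# A "chain" is just a list of functions applied one after another.
def chain(fs: list[Callable[[np.ndarray], np.ndarray]],
          x: np.ndarray) -> np.ndarray:
    out = x
    for f in fs:
        out = f(out)
    return out

def deriv(f: Callable[[np.ndarray], np.ndarray],
          x: np.ndarray,
          h: float = 1e-5) -> np.ndarray:
    # Central-difference approximation of df/dx at x.
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x: np.ndarray) -> np.ndarray:
    return x ** 2

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1 / (1 + np.exp(-x))

x = np.array([0.5, 1.0, 2.0])

# Chain rule: d/dx sigmoid(square(x)) = sigmoid'(square(x)) * square'(x)
chain_rule = deriv(sigmoid, square(x)) * deriv(square, x)

# Differentiating the whole composite numerically should agree.
direct = deriv(lambda v: chain([square, sigmoid], v), x)

print(np.allclose(chain_rule, direct, atol=1e-6))  # True
```

The `np.allclose` check at the end is just a numerical confirmation of the claim above: the derivative of the whole chain is the product of the derivatives of its pieces, evaluated at the right points.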
As we’re getting our bearings around these foundational building blocks of neural networks, we’ll systematically describe each concept we introduce from three perspectives:
- Math, in the form of an equation or equations
- Code, with as little extra syntax as possible (making Python an ideal choice)
- A diagram explaining what is going on, of the kind you would draw on a whiteboard during a coding interview
As mentioned in the preface, one of the challenges of understanding neural networks is that it requires multiple mental models. We’ll get a sense of that in this chapter: each of these three perspectives excludes certain essential features ...