# Chapter 2. Fundamentals

In Chapter 1, I described the major conceptual building block for understanding deep learning: nested, continuous, differentiable functions. I showed how to represent these functions as computational graphs, with each node in a graph representing a single, simple function. In particular, I demonstrated that such a representation makes it easy to calculate the derivative of the output of the nested function with respect to its input: we simply take the derivatives of all the constituent functions, evaluate these derivatives at the inputs that these functions received, and then multiply all of the results together. By the chain rule, this product is the correct derivative of the nested function. I illustrated that this does in fact work with some simple examples, with functions that took NumPy’s `ndarray`s as inputs and produced `ndarray`s as outputs.
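As a reminder of the recipe described above, here is a minimal sketch rather than the exact code from Chapter 1. The helper names (`square`, `sigmoid`, `deriv`, `chain_deriv`) and the central-difference approximation of each derivative are assumptions made for illustration. The sketch runs a forward pass through a chain of functions, records the input each one received, and multiplies the constituent derivatives evaluated at those inputs.

```python
import numpy as np


def square(x: np.ndarray) -> np.ndarray:
    """Elementwise square."""
    return np.power(x, 2)


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Elementwise sigmoid."""
    return 1 / (1 + np.exp(-x))


def deriv(func, input_: np.ndarray, delta: float = 1e-3) -> np.ndarray:
    """Approximate the elementwise derivative of func at input_
    with a central difference (an illustrative choice)."""
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)


def chain_deriv(chain, input_: np.ndarray) -> np.ndarray:
    """Derivative of the nested function chain[-1](...chain[0](x)...)
    with respect to x, computed via the chain rule."""
    # Forward pass: record the input that each function received.
    received = []
    x = input_
    for func in chain:
        received.append(x)
        x = func(x)

    # Multiply the constituent derivatives, each evaluated at the
    # input its function received.
    grad = np.ones_like(input_, dtype=float)
    for func, x_in in zip(chain, received):
        grad = grad * deriv(func, x_in)
    return grad


x = np.array([-1.0, 0.0, 0.5, 2.0])
numeric = chain_deriv([square, sigmoid], x)

# Closed-form derivative of sigmoid(x^2), for comparison.
s = sigmoid(square(x))
analytic = s * (1 - s) * 2 * x

print(np.allclose(numeric, analytic, atol=1e-4))
```

The comparison against the closed-form derivative of `sigmoid(x ** 2)` is only a sanity check; any chain of elementwise, differentiable functions could be passed in the same way.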

I showed that this method of computing derivatives works even when the function takes in multiple `ndarray`s as inputs and combines them via a *matrix multiplication* operation, which, unlike the other operations we saw, produces an output whose shape differs from the shapes of its inputs. Specifically, if one input to this operation, call it *X*, is a B × N `ndarray`, and another input, *W*, is an N × M `ndarray`, then its output *P* is a B × M `ndarray`. While it isn’t clear what the derivative of such an operation would be, I showed that when a matrix multiplication *ν*(*X, W*) is included as a “constituent operation” in a nested ...
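The shape rule stated above for matrix multiplication can be checked directly in NumPy. The following is a small sketch in which the sizes `B`, `N`, and `M` and the random inputs are arbitrary values chosen for illustration. It only confirms the shapes and the row-by-column definition of each output entry; it does not attempt the derivative.

```python
import numpy as np

# Arbitrary illustrative sizes (assumptions, not from the text).
B, N, M = 4, 3, 2

rng = np.random.default_rng(0)
X = rng.standard_normal((B, N))  # B x N input
W = rng.standard_normal((N, M))  # N x M input

# Matrix multiplication nu(X, W) produces the B x M output P.
P = np.matmul(X, W)
print(X.shape, W.shape, P.shape)  # (4, 3) (3, 2) (4, 2)

# Each entry P[i, j] is the dot product of row i of X and column j of W.
i, j = 1, 0
print(np.isclose(P[i, j], np.dot(X[i], W[:, j])))  # True
```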
