Errata
This errata list records errors, and their corrections, found after the product was released.
The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.
Version | Location | Description | Submitted by | Date submitted |
---|---|---|---|---|
web | Figure 1.1 | The function in Figure 1.1 is not ReLU but Leaky ReLU. | Jaap van der Does | Oct 16, 2019 |
| | 1. Foundation, The Chain Rule, first formula | Is the formula "df2(x)/du = df2(f1(x))/du * df1(x)/du" correct? I think it should be "df1f2(x)/du = df2(f1(x))/du * df1(x)/du". (See the chain-rule note after the table.) | Hiroki Nishimoto | Oct 23, 2019 |
| | Chapter 1, Figure 1.18 | Shouldn't the symbol in the second blue box be a \sigma and not a \delta? | Venkatesh-Prasad Ranganath | Nov 10, 2019 |
| | Chapter 1, Figure 1.1 | The figure says ReLU function, but instead plots Leaky ReLU. | Tamirlan Seidakhmetov | Jan 05, 2020 |
Chaptet 1 "The Fun Part: The Backward Pass" -> "Code" section |
Here "then increasing x11 by 0.001 should increase L by 0.01 × 0.2489", 0.01 should be changed to 0.001 |
Tamirlan Seidakhmetov | Jan 07, 2020 | |
| | Chapter 2, Linear Regression: The Code | Inside the forward_linear_regression function, the loss formula is incorrect. (See the loss sketch after the table.) | Tamirlan Seidakhmetov | Jan 14, 2020 |
Printed | Page 84, 2nd-to-last paragraph | The default activation is said to be "Linear", but in the code snippet it is actually "Sigmoid". So in the code snippet on p. 91, the linear_regression neural network would need an explicit assignment of the activation to Linear(); otherwise Sigmoid() would be used. | Anonymous | Oct 27, 2020 |
ePub | 1. Foundations, "The Fun Part: The Backward Pass", Code: "Now let's verify that everything worked" (the ePub does not give page numbers) | How can we verify L is correct when W is not given? W is assigned random numbers, but we don't know what they are. | Luke | Feb 06, 2022 |
ePub | Chapter 1, Nested Functions, Code: code sample for chain_length_2 | The code sample has errors. | Ellery Chan | Apr 22, 2022 |
Printed | Pages 32 and 33, bottom diagram of page 32 and first diagram of page 33 | This is in the Italian translation of the First Edition. | Anonymous | Sep 19, 2023 |
ePub | https://learning.oreilly.com/library/view/deep-learning-from/9781492041405/ch01.html, "John Cochrane, [Investments] Notes 2006" | The hyperlink for "John Cochrane, [Investments] Notes 2006" is broken. | Gökçe Aydos | Sep 22, 2023 |
ePub | Appendix, Matrix Chain Rule, "it isn’t too hard to see that the partial derivative of this with respect to x1" | "it isn’t too hard to see that the partial derivative of this with respect to `x_1`" | Gökçe Aydos | Sep 28, 2023 |
Printed | Page 10, return statement in the chain_length_2() function | In the chain_length_2() function, the return statement is f2(f1(x)), but x is undefined. The return statement should be f2(f1(a)), where a is the input to the function. (See the chain_length_2 sketch after the table.) | Anonymous | Oct 08, 2019 |
| | Page 10, Figure 1-7 | The use of "f1 f2" to indicate the composite f2(f1(x)) is confusing and nonstandard. If the author wanted to pipe the functions sequentially to create the composite above, there is a standard way of doing this; otherwise it should simply be noted. | Bradford Fournier-Eaton | Nov 01, 2021 |
| | Page 11, the math formula | 1) Page 11, the math formula. | Peter Petrov | Mar 26, 2021 |
| | Page 11, Chain Rule equation | As others have stated, the chain rule is incorrect: the chain rule does not give the derivative of one of the two functions (the author writes it as f_{2}); it should be the derivative of the composite. (See the chain-rule note after the table.) | Bradford Fournier-Eaton | Nov 01, 2021 |
| | Page 13, in the function chain_deriv_2 | # df1/dx | Pradeep Kumar | Oct 10, 2020 |
| | Page 13, in the function chain_deriv_2 | Nowhere is it mentioned what plot_chain does. No code for it is given in the chapter, nor is it clear what it does, yet this function is used throughout the first chapter. | Pradeep Kumar | Oct 10, 2020 |
Printed | Page 25, last paragraph | The text reads "...the gradient of X with respect to X." but it should read "...the gradient of N with respect to X." A gradient is a property of a function, not a vector. | Jason Gastelum | Dec 25, 2020 |
Printed | Page 28, Chapter 1 | "we compute quantities on the forward pass (here, just N)" | Anonymous | Apr 29, 2020 |
| | Page 58, code line 15 | In the backward pass (if I am not wrong), we essentially want to find how much the output changes when the input is changed by some value. | Prathamesh Waghmare | Sep 14, 2023 |
Printed | Page 64, Table 2-1, derivative table for neural network | The partial derivative dLdP = -(forward_info[y] - forward_info[p]) should be -2 * (forward_info[y] - forward_info[p]), just like the explanation on page 51. (See the derivative note after the table.) | Anonymous | Oct 25, 2019 |
Printed | Page 65, paragraph "The overall loss gradient" | I believe that in the Jupyter notebook on GitHub, in "loss_gradients", the values assigned to loss_gradients['B1'] and loss_gradients['W2'] are erroneously summed across axis=0 twice: once in the original assignments for dLdB1 and dLdB2, and then again in the assignment to loss_gradients. This makes, e.g., the loss gradient for B1 not a vector with 13 elements but a scalar, so that gradient descent updates all elements of B1 with the same gradient value, which I think is not correct. The effect on the outcome seems minor, but the graph printed on p. 67 looks somewhat different. (See the axis-sum sketch after the table.) | Anonymous | Oct 27, 2020 |
Printed | Page 65, 2nd paragraph | | Eugen Grosu | Jan 03, 2021 |
Printed | Page 66, bottom | Figure 2-13 is obviously the same as Figure 2-6; there is no difference in the fit when comparing them. | James Svacha | Jul 12, 2020 |
Printed, PDF | Page 88, section heading | The heading is the same as the chapter title *and* the book title. | Anonymous | Feb 29, 2020 |
Printed | Page 91, NeuralNetwork class invocations in the code | The NeuralNetwork class, when used on page 91, is given a learning_rate parameter, but there is no learning_rate in the __init__ function for that class, and no methods in the class use the learning_rate. This is not surprising, as the learning rate is something the Optimizer class (introduced on the following pages) cares about. (See the optimizer sketch after the table.) | David Mankins | Sep 13, 2023 |
| | Page 94, __init__ method of class Trainer | The __init__ method is missing self.optim = optim before the setattr line. (See the Trainer sketch after the table.) | Rodrigo Stevaux | Oct 07, 2020 |
ePub | Page 99, 1st paragraph | In the Lincoln library, required to run the code for Chapter 4, 'lincoln.utils.np_utils' does not contain the function 'exp_ratios'. | Steven Kaminsky | Jan 14, 2020 |
| | Page 166, the code for automatic differentiation | The book's automatic differentiation code needs to replace self.grad with backward_grad in order to calculate the derivative correctly. (See the autodiff sketch after the table.) | Nanyu | Sep 21, 2023 |
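
Chain-rule note (for the "first formula" and Page 11 chain-rule errata above): the chain rule gives the derivative of the composite f2 ∘ f1, not of f2 alone. A restatement in standard notation, approximating the book's symbols rather than quoting its typography:

```latex
\frac{d}{dx}\,(f_2 \circ f_1)(x)
  = \frac{df_2}{du}\bigl(f_1(x)\bigr) \cdot \frac{df_1}{dx}(x)
```

The left-hand side is the derivative of the composite, which is what the submitters argue the printed formula should denote.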
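Numerical-check sketch (for the "Backward Pass" -> "Code" erratum): to first order, perturbing an input by eps changes the loss by about eps times the gradient, so a 0.001 perturbation pairs with 0.001 × 0.2489, not 0.01 × 0.2489. The loss function below is a stand-in for illustration, not the book's pipeline:

```python
import numpy as np

def L(x: np.ndarray) -> float:
    # stand-in scalar loss (sum of squares), not the book's function
    return float(np.sum(np.square(x)))

x = np.array([[0.1, 0.2],
              [0.3, 0.4]])
eps = 0.001

x_perturbed = x.copy()
x_perturbed[0, 0] += eps            # bump x11 by 0.001

delta_L = L(x_perturbed) - L(x)
grad_x11 = 2 * x[0, 0]              # analytic gradient of the stand-in loss
print(delta_L, eps * grad_x11)      # the two values agree to first order
```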
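Loss sketch (for the Chapter 2 forward_linear_regression erratum): a minimal reconstruction of a linear-regression forward pass with a mean-squared-error loss, the formula the chapter uses elsewhere. Names follow the book's style, but the body is an assumption, not the book's exact code:

```python
import numpy as np
from typing import Dict

def forward_linear_regression(X_batch: np.ndarray,
                              y_batch: np.ndarray,
                              weights: Dict[str, np.ndarray]) -> float:
    # linear model: predictions P = X W + B
    N = np.dot(X_batch, weights['W'])
    P = N + weights['B']
    # mean squared error between targets and predictions
    loss = np.mean(np.power(y_batch - P, 2))
    return loss
```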
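chain_length_2 sketch (for the Page 10 erratum): the return statement must use the function's own parameter a, since x is never defined inside the function. A self-contained version consistent with the erratum; the typing details here are ours:

```python
from typing import Callable, List

import numpy as np

# a "chain" is a list of functions to be applied in sequence
Chain = List[Callable[[np.ndarray], np.ndarray]]

def chain_length_2(chain: Chain, a: np.ndarray) -> np.ndarray:
    """Evaluate two functions in a row, in a "chain"."""
    assert len(chain) == 2, "chain must contain exactly two functions"
    f1, f2 = chain
    return f2(f1(a))   # f2(f1(a)), not f2(f1(x)): x is undefined here
```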
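Derivative note (for the Page 64 / Table 2-1 erratum): with a squared-error loss, the factor of 2 comes straight from the power rule plus the inner derivative of -1:

```latex
L = (Y - P)^2
\quad\Longrightarrow\quad
\frac{\partial L}{\partial P} = 2\,(Y - P)\cdot(-1) = -2\,(Y - P)
```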
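Axis-sum sketch (for the Page 65 loss_gradients erratum): summing a per-example gradient over axis=0 once reduces a (batch_size, 13) array to the 13-element bias gradient; summing over axis=0 a second time collapses it to a scalar, so every element of B1 would receive the same update. The shapes here are illustrative:

```python
import numpy as np

per_example_grads = np.random.rand(4, 13)   # batch of 4, 13 bias elements

dLdB1 = per_example_grads.sum(axis=0)       # first sum: shape (13,), correct
print(dLdB1.shape)                          # (13,)

collapsed = dLdB1.sum(axis=0)               # second axis=0 sum: a scalar
print(np.ndim(collapsed))                   # 0
```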
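Optimizer sketch (for the Page 91 erratum): the learning rate naturally lives on the Optimizer, introduced after page 91, rather than on NeuralNetwork. A stripped-down sketch of that division of labor; the params()/param_grads() interface is assumed for illustration, not quoted from the book:

```python
class Optimizer:
    # the optimizer, not the network, owns the learning rate
    def __init__(self, lr: float = 0.01):
        self.lr = lr

class SGD(Optimizer):
    def step(self, net) -> None:
        # assumed interface: iterate parameters alongside their gradients
        for param, grad in zip(net.params(), net.param_grads()):
            param -= self.lr * grad   # in-place gradient descent update
```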
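Trainer sketch (for the Page 94 erratum): storing the optimizer on self before the setattr line is what lets that call, and later training steps, reference it. A minimal sketch; everything except the flagged line is abbreviated:

```python
class Trainer:
    def __init__(self, net, optim):
        self.net = net
        self.optim = optim                    # the line the erratum says is missing
        # link the optimizer back to the network it will update
        setattr(self.optim, 'net', self.net)
```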
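Autodiff sketch (for the Page 166 erratum): in a recursive backward pass, each parent should receive the incoming backward_grad scaled by the local derivative, not the node's accumulated self.grad; passing self.grad can double-count whenever a node is reached along more than one path. A toy stand-in, not the book's implementation:

```python
class Node:
    # minimal reverse-mode autodiff node (illustrative only)
    def __init__(self, value: float, parents=()):
        self.value = value
        self.parents = parents    # pairs of (parent_node, local_derivative)
        self.grad = 0.0

    def backward(self, backward_grad: float = 1.0) -> None:
        self.grad += backward_grad               # accumulate at this node
        for parent, local_deriv in self.parents:
            # propagate backward_grad, not self.grad: self.grad may already
            # include contributions that arrived along other paths
            parent.backward(backward_grad * local_deriv)

# y = x * x reaches x along two paths; dy/dx = 2x = 6 at x = 3
x = Node(3.0)
y = Node(x.value * x.value, parents=((x, x.value), (x, x.value)))
y.backward()
print(x.grad)   # 6.0
```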