The errata list is a list of errors and their corrections that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".
The following errata were submitted by our customers and approved as valid errors by the author or editor.
| Version | Location | Description | Submitted By | Date Submitted | Date Corrected |
|
Page Front matter
Revision History for the Third Edition |
Link to errata page says:
"See <orielly_url>/catalog/errata.csp?isbn=9781492032649 for release details."
Which is the link for the 2nd edition. The 3rd edition link is:
<oreilly_url>/catalog/errata.csp?isbn=9781098125974
I have obfuscated the full URLs as this form does not allow submissions containing URLs.
Note from the Author or Editor: Great catch, thanks! I just fixed all the links to point to the 3rd edition.
|
Rasmi Elasmar |
Dec 26, 2022 |
Jan 20, 2023 |
|
Page xxiv
Acknowledgments |
"Siddha Gangju" should be "Siddha Ganju"
Note from the Author or Editor: Oh, really sorry about that! This is now fixed.
|
O'Reilly Media |
Jan 03, 2023 |
Jan 20, 2023 |
|
Page Ch 9, page 291
2nd paragraph |
The text says that the theta value is 1.5, but if we check the plots on the previous page, the max is around 1.65
Note from the Author or Editor: Great catch, thanks! Indeed, the max is around 1.65 (about 1.66, in fact). I fixed the text.
|
Juan Manuel Parrilla Gutierrez |
Jan 07, 2023 |
Jan 20, 2023 |
|
Page p71
footnote 11 |
In footnote 11,
sklearn.set_config(pandas_in_out=True) --> sklearn.set_config(transform_output="pandas")
Thank you! :)
Note from the Author or Editor: Great catch, thanks. This is now fixed.
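For illustration, a minimal sketch of the corrected call (assuming scikit-learn >= 1.2, where the transform_output option was introduced):
import pandas as pd
import sklearn
from sklearn.preprocessing import StandardScaler

sklearn.set_config(transform_output="pandas")   # transformers now return DataFrames
X = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
X_scaled = StandardScaler().fit_transform(X)    # a DataFrame with the column names preserved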
|
Haesun Park |
Feb 16, 2023 |
Mar 07, 2025 |
|
Page p86
1st paragraph under the TIP |
Actually, make_column_selector is a class, not a function.
So I suggest changing
"make_column_selector() function" --> "make_column_selector class"
and the two occurrences of "selector function" --> "selector"
Thank you!
Note from the Author or Editor: Good point, thanks. Here's the updated paragraph:
Since listing all the column names is not very convenient, Scikit-Learn provides a `make_column_selector` class that you can use to automatically select all the features of a given type, such as numerical or categorical. You can pass a selector to the `ColumnTransformer` instead of column names or indices. Moreover, if you don't care about naming the transformers, you can use `make_column_transformer()`, which chooses the names for you, just like `make_pipeline()` does. For example, the following code creates the same `ColumnTransformer` as earlier, except the transformers are automatically named `"pipeline-1"` and `"pipeline-2"` instead of `"num"` and `"cat"`:
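The book's code is not quoted in the note; as a rough sketch of a ColumnTransformer built this way (the pipeline contents and selectors below are illustrative, not the book's exact code):
import numpy as np
from sklearn.compose import make_column_selector, make_column_transformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

num_pipeline = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
cat_pipeline = make_pipeline(SimpleImputer(strategy="most_frequent"),
                             OneHotEncoder(handle_unknown="ignore"))
preprocessing = make_column_transformer(
    (num_pipeline, make_column_selector(dtype_include=np.number)),
    (cat_pipeline, make_column_selector(dtype_include=object)),
)   # the two transformers are automatically named "pipeline-1" and "pipeline-2"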
|
Haesun Park |
Feb 16, 2023 |
Mar 07, 2025 |
|
Page p91, p92
Middle of page |
Hi Aurelien,
In the middle of pages 91 and 92,
I suggest changing `RandomForest` --> `RandomForestRegressor`
Thank you!
Note from the Author or Editor: Good catch, thanks. This is now fixed.
|
Haesun Park |
Feb 16, 2023 |
Mar 07, 2025 |
|
Page p92
In a TIP |
Hi Aurelien,
I think the `memory` argument of the `Pipeline` class is just a parameter, not a hyperparameter.
Thanks!
Note from the Author or Editor: Good point, thanks. I replaced "hyperparameter" with "parameter".
|
Haesun Park |
Feb 16, 2023 |
Mar 07, 2025 |
|
Page 602
Section with the bird (last second) |
While the encoder uses an LSTM with 512 units, the following section mentions a GRU with 10 units.
This is confusing, I think.
Note from the Author or Editor: Thanks for your feedback.
Indeed, this was a really confusing mistake, my apologies. Here's the corrected note:
The `Bidirectional` layer will create a clone of the `LSTM` layer (but in the reverse direction), and it will run both and concatenate their outputs. So although the `LSTM` layer has 256 units, the `Bidirectional` layer will output 512 values per time step.
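A quick shape check of the corrected note (a sketch, not the book's model; the layer sizes follow the note):
import tensorflow as tf

bi_lstm = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(256, return_sequences=True))
x = tf.random.normal([1, 10, 16])   # batch of 1, 10 time steps, 16 features
print(bi_lstm(x).shape)             # (1, 10, 512): 256 forward units + 256 backward units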
|
Deniz Turan |
Feb 18, 2023 |
Mar 07, 2025 |
|
Page Throughout the book
E.g. code of page 204 |
Since v1.17.0, numpy has a new interface for random number generation. It might be appropriate to upgrade the code in the next release of the book with this interface.
Note from the Author or Editor: Great suggestion, thanks!
I will definitely use the new API for the next edition.
In the meantime, if you want to switch to the new API, it's fairly simple:
Replace:
np.random.seed(42)
A = np.random.rand(2,3,4)
B = np.random.randn(2,3,4)
C = np.random.randint(5, 10, (2, 3, 4))
With:
rng = np.random.default_rng(seed=42)
A = rng.random((2,3,4))
B = rng.standard_normal((2,3,4))
C = rng.integers(5, 10, (2, 3, 4))
Also, if a function or class needs to generate random numbers, consider adding a `rng` argument/attribute.
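A minimal sketch of that last suggestion (the add_noise helper is hypothetical, just to show the pattern):
import numpy as np

def add_noise(X, rng=None):                        # hypothetical helper, for illustration only
    rng = np.random.default_rng() if rng is None else rng
    return X + rng.normal(scale=0.1, size=X.shape)

rng = np.random.default_rng(seed=42)
noisy = add_noise(np.zeros((2, 3)), rng=rng)       # reproducible because the caller controls the rng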
|
Roland Leners |
Feb 20, 2023 |
Mar 07, 2025 |
|
Page Chapter 3, page 119
the code near the top |
The book shows:
>>> y_train_pred_forest = y_probas_forest[:, 1] >= 0.5 # positive proba ≥ 50%
>>> f1_score(y_train_5, y_pred_forest)
It should be:
>>> y_train_pred_forest = y_probas_forest[:, 1] >= 0.5 # positive proba ≥ 50%
>>> f1_score(y_train_5, y_train_pred_forest)
Note from the Author or Editor: Good catch, thanks. This is now fixed.
|
Kostas Katis |
Feb 24, 2023 |
Mar 07, 2025 |
|
Page Preface
Conventions Used in This Book |
The "Punctuation" section mentions "Punctutation" (note the extra t).
I checked Amazon "look inside", which indicates the same error in the print version.
Note from the Author or Editor: Good catch, thank you! That's a typo, it should indeed be "punctuation".
|
Gary Shymkiw |
Aug 12, 2023 |
Mar 07, 2025 |
|
Page p.135
Paragraph after the plot |
The text states that “we will use … the dot() method for matrix multiplication”. However, the code snippet immediately following uses the @ operator for matrix multiplication, not dot().
Note from the Author or Editor: Good catch thanks. Fixed!
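For readers wondering what the @ operator looks like in this context, a small sketch of the Normal Equation computed with @ (not the book's exact snippet; the data is synthetic):
import numpy as np

rng = np.random.default_rng(42)
X = 2 * rng.random((100, 1))
y = 4 + 3 * X[:, 0] + rng.standard_normal(100)
X_b = np.c_[np.ones((100, 1)), X]                     # add x0 = 1 to each instance
theta_best = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y   # @ performs the matrix multiplications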
|
Brett Karopczyc |
Jul 28, 2024 |
Mar 07, 2025 |
|
Page pg 200
Equation 6.2 |
In Eq. 6.2, m is undefined. It probably is m = m_left + m_right.
Note from the Author or Editor: Good catch, thanks! It's indeed m = m_left + m_right.
|
Sashwat Tanay |
Aug 12, 2024 |
Mar 07, 2025 |
|
Page Chapter 2, Section - "Look for Correlations"
4th paragraph |
There is an error in this line:
"Since there are now 11 numerical attributes, you would get 11 ^ 2 = 121 plots, which would not fit on a page"
We have already dropped the "income_cat" feature in the previous section using the following code:
"for set_ in (strat_train_set, strat_test_set):
    set_.drop("income_cat", axis=1, inplace=True)"
So there are now a total of 10 attributes in the "housing" dataframe, of which 9 are numerical, which means we would get 9 ^ 2 = 81 plots instead of the 121 stated in the book.
Note from the Author or Editor: Good catch, thanks. Indeed, it should have been 81, not 121. Fixed!
|
Jalaj Negi |
Aug 21, 2024 |
Mar 07, 2025 |
|
Page 789
Appendix B |
Dear Mr. Geron,
I hope this message finds you well. I would like to report a few issues I encountered in Figure B2 and the explanation of autodifferentiation in the accompanying text. Please see the details below:
1. Missing Function Definition in Figure B2: The function f is not explicitly mentioned in Figure B2. The correct function should be f(x, y) = y \cdot x^2 + y + 2 .
2. Missing Multiplication in Figure B2: In Figure B2, one of the blue blocks is missing its multiplication operation sign.
3. Unclear Explanation of Autodiff: Just below Equation B3, the explanation of autodiff with a general function h is unclear. While the implementation of autodiff with basic operations (addition, subtraction, multiplication) is understandable, it is not clear how autodiff would work (operationally on a computer) when the function h cannot be written in closed form. Could you clarify how autodiff is implemented in such cases when h does not have a closed-form expression?
In short, Appendix B is badly written.
Thank you for considering these points, and I look forward to your response.
Best regards,
Sashwat Tanay
Note from the Author or Editor: Thanks for your feedback!
1. The function f is defined at the start of the appendix, but you're right, it should also be visible on the figure. I just added it.
2. Thanks, indeed the multiplication sign was missing. Fixed! :)
3. What kind of non-closed-form example do you have in mind? Loops and conditionals? If so, there are two options: eager mode or graph mode. In eager mode, the computation graph is built on the fly during the forward pass, so you can use loops and conditionals without any problem (e.g., if you run a long loop, the graph will just become large, and the backward pass will work normally on that graph). In graph mode, TensorFlow will build a new computation graph that will contain nodes for the loops and conditionals, and it will run the appropriate operations during forward and backward pass.
I could add this to appendix B, but I think it might confuse the reader, and I see it as a technical detail. The main idea of reverse-mode autodiff is that you go forward through the computation graph, you store all the intermediate results, and you flow the gradients backward by applying the chain rule at each step.
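As a tiny illustration of the eager-mode point (not from the book), gradients flow through an ordinary Python loop because the operations are recorded as they run:
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x
    for _ in range(3):        # the loop simply makes the recorded graph longer
        y = y * x             # after the loop, y = x ** 4
print(tape.gradient(y, x))    # 4 * x ** 3 = 108.0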
|
Sashwat Tanay |
Sep 26, 2024 |
Mar 07, 2025 |
|
Page 306
Pg 306 |
In Chapter 10, specifically with Equations 10.2 and 10.3, the output seems to be represented by different notations. In Equation 10.2, the predicted output is denoted by h_(W,b)(X) (or phi ), while in Equation 10.3, the letter y is used. This variation in notation may lead to confusion, as it’s not immediately clear if these are conceptually different or if they represent the same output in different contexts.
Note from the Author or Editor: You're right, I replaced h_{W,b}(X) with \hat{Y} in equation 10.3, and I added a line explaining that it's the output matrix, with one row per instance, and one column per neuron.
|
Sashwat Tanay |
Sep 26, 2024 |
Mar 07, 2025 |
|
Page Chapter 2 "Select a Performance Measure"
Equation 2-1 |
In Equation 2-1 for RMSE, the parentheses are not balanced. There should be an opening parenthesis before h.
Note from the Author or Editor: Good catch, thanks, this is now fixed.
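For reference, the corrected equation with balanced parentheses follows the standard RMSE definition:
RMSE(X, h) = sqrt( (1/m) * Σ_{i=1}^{m} ( h(x^(i)) − y^(i) )^2 )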
|
Apurva Singh |
Dec 28, 2024 |
Mar 07, 2025 |
|
Page Chapter 3, "Performance Measures... The ROC Curve", p. 117- 118
p.117, middle of page; p. 118, WARNING box, top of page |
"Luckily, it [RandomForestClassifier] has a predict_proba() method that returns class probabilities for each instance, and we can just use the probability of the positive class as a score, so it will work fine. [Footnote 4] ...
Footnote 4: Scikit-Learn classifiers always have either a decision_function() method or a predict_proba() method, or sometimes both.
WARNING
These are estimated probabilities, not actual probabilities. ...
The sklearn.calibration package contains tools to calibrate the estimated probabilities and make them much closer to actual probabilities.
See the extra material section in this chapter’s notebook for more details."
Corrections:
a) Given this WARNING, "so it will work fine" seems misleading and incorrect.
b) The chapter 3 notebook at
github dot com/ageron/handson-ml3/blob/main/03_classification.ipynb
is missing code that uses the sklearn.calibration package or that addresses the topic of "calibration".
c) I think this WARNING should be integrated into the core text of the textbook, and taught prior to showing how to obtain so-called "probabilities" using the predict_proba() method, because some people might skip the warnings.
I think understanding the difference between non-calibrated and calibrated probabilities is an important point too many textbooks omit.
Note from the Author or Editor: Thanks for your message. I agree that it's important to make it clear that classifiers output *estimated* probabilities, not actual probabilities.
a) You're right, the phrase "so it will work fine" was unfortunate, I was referring to the fact that you wouldn't get an error, not that there's nothing to worry about. I rephrased this for the next reprints and the next edition.
b) I initially intended to add a code example showing how to use sklearn.calibration, but decided that the documentation was clear enough, but I forgot to remove the reference in the text. I just removed it.
c) The scorpion notes are intended to stand out and represent important warnings. I asked O'Reilly and several friends, and they all agree that important messages should be put in such boxes. However, I replaced "class probabilities" with "estimated class probabilities" the first time these terms are used, which is just before the warning message.
Thanks again for your feedback!
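For readers who want to try the sklearn.calibration tools mentioned in the warning, a minimal sketch (not from the book; the dataset here is synthetic):
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, random_state=42)
calibrated_clf = CalibratedClassifierCV(RandomForestClassifier(random_state=42), cv=3)
calibrated_clf.fit(X, y)
probas = calibrated_clf.predict_proba(X)   # estimated probabilities, recalibrated via cross-validation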
|
Tom Cal |
Feb 02, 2025 |
Mar 07, 2025 |
|
Page 103, Chapter 3: Classification
Last paragraph. |
I can't seem to find mnist_784 dataset anywhere. The URL for fetching the dataset from open ml page is no longer valid. I can't proceed with Chapter 3. Please help.
Note from the Author or Editor: Thanks for your feedback.
This was probably due to a temporary server error at openml.org (an organization that hosts many datasets), maybe the one that I reported in August? They fixed the issue after a few days, so it should work now. Could you please try again?
|
Deepak Sanghi |
Aug 10, 2025 |
|
|
Page Chapter 5, Nonlinear SVM Classification, SVM Classes and Computational Complexity
1st, 2nd, 3rd paragraph and Table 5-1 |
Big O notation is rendered incorrectly: e.g., instead of O (m × n) it is ݓm_ × n). I checked it in the latest versions of Google Chrome and Safari.
Note from the Author or Editor: Good catch, thanks.
I used U+1D4AA Mathematical Script Capital O Unicode Character, but unfortunately it wasn't supported by the rendering tools downstream. So I replaced it with a regular capital O.
|
Victor Khaustov |
May 31, 2022 |
Oct 03, 2022 |
|
Page Chapter 7, Bagging and Pasting in Scikit-Learn
Code example after the 1st paragraph |
n_jobs=-1 argument should be added to the code example to match the description and the GitHub notebook.
Note from the Author or Editor: Indeed, the book and notebook were not in sync. I added n_jobs=-1 to the book, thank you!
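For context, the code example in question looks roughly like this with n_jobs=-1 added (a sketch from memory, not a verbatim copy of the book):
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                            max_samples=100, n_jobs=-1, random_state=42)
bag_clf.fit(X, y)   # n_jobs=-1 uses all available CPU cores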
|
Victor Khaustov |
Jun 10, 2022 |
Oct 03, 2022 |
|
Page Chapter 7, Random Forests
1st paragraph |
The following sentence should be updated: "The following code trains a Random Forest classifier with 500 trees, each limited to maximum 16 nodes, and using all available CPU cores:". Instead of "maximum 16 nodes" -> "maximum 16 leaf nodes". The tree will have a maximum of 16 leaf nodes and 15 (16 - 1) split nodes.
Note from the Author or Editor: Great catch, thanks. As you suggested, I replaced "maximum 16 nodes" with "maximum 16 leaf nodes".
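The corrected sentence describes code along these lines (a sketch; the toy moons dataset below is illustrative, not necessarily the book's data):
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16,   # at most 16 leaf nodes per tree
                                 n_jobs=-1, random_state=42)
rnd_clf.fit(X, y)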
|
Victor Khaustov |
Jun 10, 2022 |
Oct 03, 2022 |
|
Page Chapter 7, Boosting, Histogram-Based Gradient Boosting
2nd paragraph |
Another formatting issue for the Big O notation: ݓn_×m) -> O (n×m) and ݓn_×m×log(m)) -> O (n×m×log(m)).
Note from the Author or Editor: Good catch, thanks.
I used U+1D4AA Mathematical Script Capital O Unicode Character, but unfortunately it wasn't supported by the rendering tools downstream. So I replaced it with a regular capital O.
|
Victor Khaustov |
Jun 10, 2022 |
Oct 03, 2022 |
|
Page Chapter 7, Boosting, Histogram-Based Gradient Boosting
2nd paragraph |
Either clarification is required for the following sentence "In practice, this means that HGB can train hundreds of times faster than regular GBRT on large datasets." or the big O notation should be updated to show that "n" is not the same (e.g., n<sub>bins</sub> may be used to show that it is not the number of features but the number of bins).
Note from the Author or Editor: Excellent point, thank you. I replaced the previous sentence with:
As a result, this implementation has a computational complexity of _O_(_b_×_m_) instead of _O_(_n_×_m_×log(_m_)), where _b_ is the number of bins, _m_ is the number of training instances, and _n_ is the number of features.
|
Victor Khaustov |
Jun 10, 2022 |
Oct 03, 2022 |
|
Page Chapter 7, Stacking
2nd paragraph |
"can be simply be copied" -> "can simply be copied"
Note from the Author or Editor: Good catch, thanks!
|
Victor Khaustov |
Jun 10, 2022 |
Oct 03, 2022 |
|
Page 47
Figure 2-4: Your notebook in Google Colab |
Figure contents are out of sync with the GitHub version
Note from the Author or Editor: Good catch, thanks, this is now fixed.
|
Morten Hoffmann |
Nov 15, 2023 |
Mar 07, 2025 |
|
Page 64
2nd paragraph |
Because the combined attributes have not been added yet, there are 9 numerical attributes.
So, "there are now 11 numerical attributes, you would get 11^2 = 121 plots" should be "there are now 9 numerical attributes, you would get 9^2 = 81 plots".
Thanks
Note from the Author or Editor: Good catch, thanks, this is now fixed.
|
Haesun Park |
Mar 07, 2023 |
Mar 07, 2025 |
|
Page 89
line 1 |
Wouldn't it be better to use `root_mean_squared_error()` instead of `mean_squared_error()`?
Note from the Author or Editor: Thanks for your suggestion. This function did not exist when I wrote the book. But yes, I agree that it should now be used instead of mean_squared_error(..., squared=False). I updated the book and notebook.
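A minimal sketch of the swap (root_mean_squared_error() requires scikit-learn >= 1.4; the arrays are toy values):
import numpy as np
from sklearn.metrics import root_mean_squared_error

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
rmse = root_mean_squared_error(y_true, y_pred)   # replaces mean_squared_error(..., squared=False)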
|
Ryoko |
Oct 16, 2024 |
Mar 07, 2025 |
|
Page 93
Second paragraph |
In the second paragraph on page 93, right below the code it states that the best model is obtained by setting max_features to 8, but the output of the code states it should be set to 6.
Note from the Author or Editor: Good catch, thanks! This is fixed now.
|
Sam |
Jul 03, 2023 |
Mar 07, 2025 |
|
Page 96
Last paragraph |
Student's t-distribution is used to compute the confidence interval for the generalization error i.e. the mean of the squared errors. I understand that this would require the squared errors to follow a normal distribution (see e.g. Wikipedia's page on Student's t-distribution). However they do not. The data has a cut-off at 0 and a long tail towards high values.
Do I miss something here?
Note from the Author or Editor: That's a great question, thanks!
The Central Limit Theorem (CLT) ensures that the mean of a large enough sample of some random variable (such as squared errors) follows a Normal distribution, regardless of the distribution of that random variable. Therefore, because our test set is large enough, we can safely use Student's t-distribution to estimate a 95% confidence interval for the MSE.
However, once we have a 95% confidence interval [a, b] for the MSE, it does not follow that [sqrt(a), sqrt(b)] is a 95% confidence interval for the RMSE: we may get a skewed interval, not centered on the median RMSE. That's because the square root is not a linear function. In practice, this approach still gives reasonably good results.
However, a more rigorous way to compute a 95% confidence interval for the RMSE is to use SciPy's bootstrap() function:
import numpy as np
from scipy.stats import bootstrap

def rmse(squared_errors):
    return np.sqrt(np.mean(squared_errors))

confidence = 0.95
squared_errors = (final_predictions - y_test) ** 2
boot_result = bootstrap([squared_errors], rmse, confidence_level=confidence,
                        random_state=42)
rmse_lower, rmse_upper = boot_result.confidence_interval
I've updated the book and the notebook accordingly.
|
Roland Leners |
Jan 26, 2023 |
Mar 07, 2025 |
|
Page 101
Exercises 5. |
Automatically explore some preparation options using *GridSearchCV*.
->
Automatically explore some preparation options using *RandomizedSearchCV*.
Because line 5971 of 02_end_to_end_machine_learning_project.ipynb on GitHub uses RandomizedSearchCV.
Note from the Author or Editor: Good catch, thanks. Yes, I initially planned to use GridSearchCV but then switched to RandomizedSearchCV and forgot to update the question. Fixed!
|
Ryoko |
Jul 13, 2024 |
Mar 07, 2025 |
|
Page 112
code block in the middle |
Hi,
After `>>> y_some_digits_pred = (y_scores > threshold)`, `>>> y_some_digits_pred` should be added.
Thank you!
Note from the Author or Editor: Great catch, thanks! This is now fixed.
|
Haesun Park |
Feb 23, 2023 |
Mar 07, 2025 |
|
Page 113
2nd paragraph |
Hi,
"the function adds a last precision of 0 and a last recall of 1" should be "the function adds a last precision of 1 and a last recall of 0".
Thank you!
Note from the Author or Editor: Great catch, thanks! This is now fixed.
|
Haesun Park |
Feb 23, 2023 |
Mar 07, 2025 |
|
Page 118
Caution block |
Hi,
Last sentence of the caution block, "See the extra material... this chapter's notebook for more details".
But there is no extra material in the notebook.
Thanks!
Note from the Author or Editor: Thanks for your feedback. Indeed, I initially intended to add a code example, but Scikit-Learn's documentation for this topic is sufficient. I removed this sentence.
|
Haesun Park |
Feb 23, 2023 |
Mar 07, 2025 |
|
Page 120
Middle of page |
Hi,
As you may know, "the number of won duels plus or minus a small tweak (max ± 0.33) to break ties" is true if `break_ties=True`.
But `break_ties` parameter's default value is `False`.
So please add some explanation.
Thanks!
Note from the Author or Editor: Good point, thanks. I added a footnote just after "that won the most duels.":
In case of a tie, the first class is selected, unless you set the `break_ties` hyperparameter to `True`, in which case ties are broken using the output of the `decision_function()`.
Hope this helps.
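A minimal sketch of the break_ties hyperparameter mentioned in the footnote (note that SVC only honors break_ties=True when decision_function_shape="ovr"):
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svm_clf = SVC(decision_function_shape="ovr", break_ties=True).fit(X, y)
# predict() now breaks ties between classes using the decision_function() scores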
|
Haesun Park |
Feb 23, 2023 |
Mar 07, 2025 |
|
Page 126,127
last paragraph |
Hi,
In the last paragraph on p126 and the first paragraph on p127, `ChainClassifier` should be `ClassifierChain`.
Thank you!
Note from the Author or Editor: Good catch, thanks. This is now fixed.
|
Haesun Park |
Feb 23, 2023 |
Mar 07, 2025 |
|
Page 132
df_output=pd.DataFrame(cat_encoder.transform(df_test_unknown... |
If you follow the code in the book, the cat_encoder output is a sparse matrix.
So in the code where df_output is created, the cat_encoder output must be converted with the .toarray() method.
The code on GitHub defines cat_encoder with sparse_output=False, thus it works, but for a reader just following the book it will not work.
Note from the Author or Editor: Great catch, thanks. I updated the book to show the code that creates the new cat_encoder with sparse_output=False. Fixed!
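A minimal sketch of the updated pattern (assuming scikit-learn >= 1.2, where the sparse_output argument exists; the toy DataFrame below stands in for the book's data):
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({"ocean_proximity": ["INLAND", "NEAR BAY", "INLAND"]})
cat_encoder = OneHotEncoder(sparse_output=False)   # dense output, no .toarray() needed
encoded = cat_encoder.fit_transform(df)
df_output = pd.DataFrame(encoded, columns=cat_encoder.get_feature_names_out(), index=df.index)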
|
vig |
Jun 12, 2024 |
Mar 07, 2025 |
|
Page 204
Figure 6-4 |
The decision tree in figure 6-4 is inconsistent with figure 6-5 (max_depth=2). The decision tree published on Github is consistent though.
Note from the Author or Editor: Good catch, thanks. Fixed now.
|
Roland Leners |
Feb 20, 2023 |
Mar 07, 2025 |
|
Page 204
Figure 6-4 |
I believe that Figure 6-4 is from the 2nd edition, and it is different from the one in the 3rd edition notebook.
Additionally, the explanation below Figure 6-2 needs to be updated to reflect this change.
Thanks!
Note from the Author or Editor: Good catch, thanks! This is now fixed.
|
Haesun Park |
Mar 07, 2023 |
Mar 07, 2025 |
|
Page 233
First line |
The page break between pages 232 and 233 garbles up the text.
It currently reads:
", and use these (page break) can be used as the input features..."
instead of:
", and use these (page break) as the input features ..."
Note from the Author or Editor: Good catch, thanks. Fixed!
|
Roland Leners |
Mar 13, 2023 |
Mar 07, 2025 |
|
Page 245
Last paragraph |
I think that understanding of the reasoning would be improved if the text stated that the singular values of s (as returned by np.linalg.svd() ) are sorted in descending order. I understand that this is the reason why one can select the first principal components for reducing dimensionality.
Note from the Author or Editor: Good point, thanks! I added "in the correct order" at the end of the paragraph:
So how can you find the principal components of a training set? Luckily, there is a standard matrix factorization technique called _singular value decomposition_ (SVD) that can decompose the training set matrix *X* into the matrix multiplication of three matrices *U* *Σ* *V*^⊺^, where *V* contains the unit vectors that define all the principal components that you are looking for, in the correct order, as shown in Equation 8–1.
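A quick check of that point (a sketch with random data, not the book's code): np.linalg.svd() returns the singular values sorted in descending order, so the principal components in V are already in the right order.
import numpy as np

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 5))
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered)
print(s)                      # singular values, in descending order
W2 = Vt[:2].T                 # unit vectors of the first two principal components
X2D = X_centered @ W2         # project the training set onto the first two PCs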
|
Roland Leners |
Mar 15, 2023 |
Mar 07, 2025 |
|
Page 288
Just above the code snippet |
The code looks for a 2% threshold, but the paragraph above (twice) says 4.
The text in the Jupyter notebook on GitHub consistently uses 2.
Note from the Author or Editor: Good catch, thanks. This is now fixed.
|
Peter Drake |
Apr 06, 2023 |
Mar 07, 2025 |
|
Page 377
3rd paragraph |
The homl.info link to extra-anns is broken.
Note from the Author or Editor: Thanks for your feedback. This is now fixed.
|
Roland Leners |
May 08, 2023 |
Mar 07, 2025 |
|
Page 390
First line of code |
The code does not work because "decay is deprecated in the new Keras optimizer, please check the docstring for valid arguments, or use the legacy optimizer, e.g., tf.keras.optimizers.legacy.SGD.".
I understand that Power scheduling is replaced by the PolynomialDecay scheduler, though both are only equivalent as long as t/s << 1.
Probably the rest of the chapter on schedulers needs to be adapted as well to the Learning rate schedules API.
Finally it might be useful to specify that a "(training) step" consists of one iteration on one batch. The terms "step" and "iteration" could benefit from a definition somewhere in the book.
Note from the Author or Editor: Thanks for your feedback. The `decay` argument no longer works, so I updated the book and the notebook to use the `InverseTimeDecay` scheduler instead.
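For illustration, a minimal sketch of the replacement scheduler (the constants are arbitrary, not the book's values):
import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=0.01, decay_steps=10_000, decay_rate=1.0)   # power scheduling with c = 1
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)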
|
Roland Leners |
May 10, 2023 |
Mar 07, 2025 |
|
Page 420
Code snippet |
In the build function, the parent's build method is not called (contrary to what the text says). On Github it is called:
super().build(batch_input_shape) # must be at the end
Note from the Author or Editor: Thanks for your feedback. Calling `super().build(batch_input_shape)` at the end of the `build()` method used to be compulsory, but this changed, and now it's no longer needed. I decided to simplify the code by removing this line everywhere in the book and the notebooks.
|
Roland Leners |
Jun 21, 2023 |
Mar 07, 2025 |
|
Page 425
Footnote |
The footnote states: "Due to TensorFlow issue #46858, the call to super().build() may fail in this case, unless the issue was fixed by the time you read this. If not, you need to replace this line with self.built = True."
However, in the code snippet, neither "super().build()" nor "self.built = True" appears. They do appear in the Google Colab and GitHub notebooks.
The footnote (as well as the code in Google colab and GitHub) should be removed, as "super().build()" is deprecated in the latest TensorFlow versions (> v2.5.0), Keras will call it for us.
Note from the Author or Editor: Good catch, thanks. Indeed, calling super().build() or setting self.built = True is no longer needed, and I removed it from the code in the book, but I forgot to remove the footnote. I'll also update the Colab notebooks now.
|
Marcos Rodrigo |
Aug 09, 2023 |
Mar 07, 2025 |
|
Page 483
Figure 14-3 |
In Fig 14-3, the bottom blue 3x3 square should be shifted by one square, since the stride in that figure is 1.
Note from the Author or Editor: Great catch, thanks a lot! This figure was correct in the previous editions, but it was not very pretty, so we tried to beautify it, but in the process we made a copy/paste error with Figure 14-4. My apologies, I should have caught that. This is now fixed.
|
Eugenio Marco Rubio |
Dec 12, 2022 |
Jan 20, 2023 |
|
Page 483
Figure 14-3 |
In addition to the error reported earlier (stride = 1 and not 2), the curly brace representing fw=3 is one unit too large.
Note from the Author or Editor: Thanks! It's fixed now.
|
Roland Leners |
Jun 15, 2023 |
Mar 07, 2025 |
|
Page 489
last 2 paragraphs |
The last 2 paragraphs of page 489 in the printed edition state (sorry for the subscripts not showing up correctly):
"With padding="valid", if the width of the input is ih, then the output width is equal to (ih – fh + sh) / sh, rounded down. Recall that fh is the kernel width, and sh is the horizontal stride. Any remainder in the division corresponds to ignored columns on the right side of the input image. The same logic can be used to compute the output height, and any ignored rows at the bottom of the image.
With padding="same", the output width is equal to ih / sh, rounded up. To make this possible, the appropriate number of zero columns are padded to the left and right of the input image (an equal number if possible, or just one more on the right side). Assuming the output width is ow, then the number of padded zero columns is (ow – 1) × sh + fh – ih. Again, the same logic can be used to compute the output height and the number of padded rows."
In all other locations, for example, fh refers to the height of the kernel/receptive field, not the width. It seems like these last 2 paragraphs are mixing up height and width in a couple of locations (i.e. the subscripts should be all w's or all h's).
Note from the Author or Editor: Good catch, thanks. You're right, all the h subscripts in these two paragraphs should be w subscripts, my apologies for the confusion.
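A quick numeric check of the corrected (all-w) formulas, with made-up sizes:
import math

i_w, f_w, s_w = 13, 5, 3                 # input width, kernel width, horizontal stride
valid_w = (i_w - f_w + s_w) // s_w       # padding="valid": floor(11 / 3) = 3 output columns
same_w = math.ceil(i_w / s_w)            # padding="same": ceil(13 / 3) = 5 output columns
pad_w = (same_w - 1) * s_w + f_w - i_w   # zero columns padded: 4
print(valid_w, same_w, pad_w)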
|
Michael Gilbert |
Apr 15, 2025 |
|
|
Page 514
line 2 |
The addition signs seem incorrect; multiplication signs would be correct.
\alpha + \beta^2 + \gamma^2
should be:
\alpha \times \beta^2 \times \gamma^2
Note from the Author or Editor: Good catch, thanks. This is now fixed.
|
Ryoko |
Sep 30, 2024 |
Mar 07, 2025 |
|
Page 563
`to_seq2seq_dataset` function code block |
To my understanding, the lambda function S should actually be defined as:
>>> lambda S: (S[:, 0], S[:, 1:, target_col])
instead of
>>> lambda S: (S[:, 0], S[:, 1:, 1])
Otherwise, the `to_seq2seq_dataset` function parameter `target_col` is not even used.
Note from the Author or Editor: Good catch, thanks! I fixed this in the book and the notebook.
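A toy check of the slicing in the corrected lambda (shapes only; the to_seq2seq_dataset helper itself is not reproduced here):
import tensorflow as tf

S = tf.reshape(tf.range(2 * 4 * 3, dtype=tf.float32), [2, 4, 3])   # [windows, time steps, columns]
target_col = 2
inputs, targets = S[:, 0], S[:, 1:, target_col]
print(inputs.shape, targets.shape)   # (2, 3) and (2, 3): step 0 as input, target column of later steps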
|
Riccardo Trevisan |
Mar 09, 2023 |
Mar 07, 2025 |
|
Page 658
line just above Figure 17-12 |
Figure 17-12 shows the *12* generated images.
->
Figure 17-12 shows the *21* generated images.
Note from the Author or Editor: Good catch, thanks! This is now fixed.
|
Ryoko |
Oct 19, 2024 |
Mar 07, 2025 |
| Printed |
Page 788
Figure B-1 |
All five periods in Figure B-1 should be changed to multiplication dots (\cdot).
(3rd equation)
\partial(u.v) / \partial x = \partial v / \partial x. u + v.\partial u / \partial x
->
\partial(u \cdot v) / \partial x = \partial v / \partial x \cdot u + v \cdot \partial u / \partial x
(bottom right side equation)
\partial g / \partial x = 0 + (0.x + y.1) = y
->
\partial g / \partial x = 0 + (0 \cdot x + y \cdot 1) = y
|
Ryoko |
Oct 18, 2024 |
Mar 07, 2025 |
| Printed |
Page 828
Last entry in the index on the page |
Incomplete reference in index: sklearn.model_selection.RandomizedSearchCV is already introduced on pages 93/94 (in addition to p. 248)
|
Roland Leners |
Feb 16, 2023 |
Mar 07, 2025 |