Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

Errata for Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Second Edition

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date submitted
Printed Page p.64
Second paragraph

9 numerical attributes, not 11

Michael VanValkenburgh  Nov 17, 2022 
Printed Page Chapter 1 page 32
2nd paragraph

"You can see that the regularization forced the model to have a smaller slope"
should be "bigger slope" for this data according to the figure.

Karim Badr  Dec 02, 2022 
PDF, ePub Page Under "Reinforcement Learning" (there is no page number, but my PDF reader shows 38)
Start of the Reinforcement Learning paragraph

"Reinforcement Learning isag a very different beast". I'm pretty sure "isag" was meant to be "is"

Rayan Khan  Mar 15, 2023 
Printed Page Chapter 3, page 101
Paragraph 2 following the code chunk

The text states that the previous code chunk, which uses an SVC, applied the OvO strategy and trained 45 binary classifiers. However, the decision function returns only 10 scores, which indicates that an OvR strategy is used. This is consistent with the scikit-learn documentation, which states that the default strategy is OvR.

Furthermore, running the book's code with decision_function_shape='ovo' does return an array of 45 decision function scores, unlike what is printed.

Danly Omil-Lima  Apr 02, 2023 
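
As a quick check on the two score lengths mentioned in the report above (45 and 10), the number of binary classifiers implied by each strategy can be counted directly. This is a minimal sketch in plain Python; the function names are illustrative and are not part of scikit-learn.

```python
from itertools import combinations

def n_ovo_classifiers(n_classes):
    """One-vs-one: one binary classifier per unordered pair of classes."""
    return len(list(combinations(range(n_classes), 2)))

def n_ovr_classifiers(n_classes):
    """One-vs-rest: one binary classifier per class."""
    return n_classes

n_classes = 10
print(n_ovo_classifiers(n_classes))  # 45
print(n_ovr_classifiers(n_classes))  # 10
```
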
Printed Page p39
Equation 2-1

Characterizing RMSE as

> RMSE(X, h) = ...

where (p40)

> X is a matrix containing all the feature values (excluding the label)

suggests that the labels are not an input to the RMSE function.

Same comment for MAE(X, h) on p41

Sean Fitzgibbon  Jun 02, 2023 
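
To illustrate the point raised above, here is a minimal sketch of an RMSE function whose signature lists the labels explicitly. The names X, y, and h follow the notation quoted in the report, but the separate y argument and the toy data are assumptions made for illustration, not the book's definition.

```python
import math

def rmse(X, y, h):
    """Root mean squared error of hypothesis h on feature rows X with labels y."""
    if len(X) != len(y):
        raise ValueError("X and y must contain the same number of instances")
    squared_errors = [(h(x) - label) ** 2 for x, label in zip(X, y)]
    return math.sqrt(sum(squared_errors) / len(X))

# Hypothetical toy data: one feature per instance and a linear hypothesis.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 7.0]
h = lambda x: 2.0 * x[0]

print(rmse(X, y, h))
```

Without y, the function would have nothing to compare the predictions h(x) against, which is why the labels appear as an input here.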
Printed Page p41
penultimate bullet

the formula for

> lk norm of a vector containing n elements

refers to v0, v1, ..., vn, which is (n+1) elements.

Sean Fitzgibbon  Jun 02, 2023 
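
The indexing issue above can be made concrete with a short sketch. Assuming zero-based indexing, a vector with n elements runs from v[0] to v[n-1]; the function name lk_norm is illustrative.

```python
def lk_norm(v, k):
    """l_k norm of a vector v with n elements, indexed v[0] .. v[n-1]."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return sum(abs(x) ** k for x in v) ** (1.0 / k)

v = [3.0, -4.0]  # n = 2 elements: v[0] and v[1]
print(lk_norm(v, 1))
print(lk_norm(v, 2))
```
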
Printed Page Page 420, Chapter 12 (Custom Layers)
Second paragraph, code snippet

The build() method is missing the super().build(batch_input_shape) call that is referenced in the code walkthrough later on the same page.

Ricardo Acevedo  Sep 02, 2023 
PDF Page pg. 159
Last paragraph, above the Tip

There is a mismatch between the following code snippet's output and the paragraph (code output: max_features: 6; paragraph: max_features: 8).

>>> grid_search.best_params_
{'preprocessing__geo__n_clusters': 15, 'random_forest__max_features': 6}

The paragraph claims that the best model is obtained by setting max_features to 8.

Running the code in a PyCharm environment also results in max_features: 6.

I believe that in the paragraph 8 should be replaced with 6.

Cristian Tanase  Oct 25, 2023 
Printed Page Page 164, Third Edition
After equation 4-16

After Equation 4-16, the text states: "This cost function makes sense because -log(t) grows very large when t approaches 0, so the cost will be large if the model estimates a probability close to 0 [...]"

As I understand it, this paragraph and the following sentences should use "-log(p)" instead of "-log(t)". The probability, i.e. sigma(t), approaches zero only for negative values of t, whereas log(t) is not defined for negative t. This contradicts the quoted statement.

Anonymous  Feb 21, 2024 
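
The numerical facts behind the report above can be checked with a few lines of plain Python. This sketch does not settle which symbol the book intends; it only shows how -log behaves for small probabilities and for a negative score. The name sigmoid is illustrative.

```python
import math

def sigmoid(t):
    """Logistic function, mapping any real score t to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

# The cost -log(p) grows as the estimated probability p approaches 0.
for p in (0.5, 0.1, 0.001):
    print(p, -math.log(p))

# p = sigmoid(t) is close to 0 only when the score t is strongly negative.
t = -10.0
print(sigmoid(t), -math.log(sigmoid(t)))

# The logarithm itself is undefined for a negative argument.
try:
    math.log(t)
except ValueError as err:
    print("math.log(-10.0) raised ValueError:", err)
```
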
Printed Page 46
Last paragraph

In the next to last sentence (beginning "Notice that Jupyter notebooks use Markdown..."), the example markdown for linked text is reversed. The link text is placed within the square brackets, while the URL is placed within the parentheses.

Brett Karopczyc  Aug 04, 2023 
Printed Page 116, Chapter 2, Looking for correlations
First code snippet

When running the full code in PyCharm, the following line raises an error because the "ocean_proximity" attribute contains strings:
corr_matrix = housing.corr()

To fix this, pass the numeric_only parameter to the corr() method so that it uses only numeric values:
corr_matrix = housing.corr(numeric_only=True)

Cristian Tanase  Sep 28, 2023 
Printed Page 154
Figure 5.1, right plot

When running the notebook with scikit-learn version 1.2.1 the plot is generated, but the margin lines do not pass through the support vectors.

Uli Raich  Jan 30, 2023 
Mobi Page 158
second paragraph

A great alternative is to use Scikit-Learn’s k_-fold cross-validation feature.

Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (p. 159). O'Reilly Media. Kindle Edition.

I think it should be "k-fold" instead of "k_-fold"

David Dieulivol  Apr 25, 2023 
Printed Page 283
figure 10-3

2nd printed edition, page 283, figure 10-3

Neuron "A" of the rightmost structure (C=A && !B) has two outputs going to neuron "C". I'm not sure this is correct. "A" fully determines the output of "C", "B" has no influence.

Shouldn't it be just one connection going from A to C?

Anonymous  Feb 04, 2024 
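
One possible reading of the double connection questioned above can be sketched in code. The sketch assumes, and this is not stated in the report, that the figure uses a threshold convention in which a neuron fires only when at least two of its excitatory inputs are active and any active inhibitory input blocks it. The function names and the threshold value are assumptions for illustration.

```python
def neuron(excitatory, inhibitory=(), threshold=2):
    """Fire (return 1) if no inhibitory input is active and at least
    `threshold` excitatory inputs are active."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

def c_equals_a_and_not_b(a, b):
    # A feeds C through two excitatory connections; B through one inhibitory one.
    return neuron(excitatory=[a, a], inhibitory=[b])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, c_equals_a_and_not_b(a, b))
```

Under that assumed convention, a single connection from A would never reach the threshold, and B does change the output when A is active.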
ePub Page 289

I think the U and V matrices of the SVD are mixed up: the coefficients are in the V matrix, but the unit vectors of the principal components are in the U matrix of the SVD.

Oli Holl  Aug 19, 2021 
PDF Page 323
Number of Hidden Layers

The "Number of Hidden Layers" section says:
"To understand why, suppose you are asked to draw a forest using some drawing software, but you are forbidden to copy and paste anything."
Later it says: "... and finally copy and paste this tree to make a forest, you would be finished in no time."
Copy and paste is forbidden at first, and then the problem is solved by using the very copy and paste that was not allowed in the first place.
This analogy is confusing.

Anonymous  Jul 15, 2021 
Other Digital Version 326
first line of code in the page

y_representative_digits is written correctly in the book, but it is wrong in the GitHub and Google Colab notebooks. The incorrect y_representative_digits values lead to an extremely low accuracy (10%). The erroneous line is:
y_representative_digits = np.array([
0, 1, 3, 2, 7, 6, 4, 6, 9, 5,
1, 2, 9, 5, 2, 7, 8, 1, 8, 6,
3, 2, 5, 4, 5, 4, 0, 3, 2, 6,
1, 7, 7, 9, 1, 8, 6, 5, 4, 8,
5, 3, 3, 6, 7, 9, 7, 8, 4, 9])

Khalid ElHaj  Feb 22, 2022 
Printed Page 350
1st sentence in the section "Pretraining on an Auxiliary Task"

Current: last option is to train a first neural network on an auxiliary task...
Should be: last option is to first train a neural network on an auxiliary task...

Velimir Graorkoski  Aug 16, 2023 
Printed Page 357
AdaMax paragraph

According to the referenced paper, it should be max(β2s, |∇θJ(θ)|) rather than max(β2s,∇θJ(θ)), with step 5 becoming θ ← θ − ηm^⊘√(s^) (or θ ← θ + ηm^⊘√(s^) to remain consistent with Equation 11-8).

Anonymous  Oct 13, 2021 
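
To show why the absolute value matters in the max, here is a minimal sketch of one AdaMax-style update for a single scalar parameter. It uses the convention in which m accumulates the gradient itself, so the step subtracts; the book's Equation 11-8 may use the opposite sign. The function name, the default hyperparameter values, the guard against a zero denominator, and the toy objective are all assumptions for illustration.

```python
def adamax_step(theta, m, u, grad, t, lr=0.002, beta1=0.9, beta2=0.999):
    """One AdaMax-style update for a single scalar parameter (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad   # decayed mean of gradients
    u = max(beta2 * u, abs(grad))          # decayed infinity norm, never negative
    if u > 0.0:                            # skip the update if no gradient seen yet
        theta = theta - (lr / (1.0 - beta1 ** t)) * m / u
    return theta, m, u

# Hypothetical objective J(theta) = theta**2, with gradient 2 * theta.
theta, m, u = -1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, u = adamax_step(theta, m, u, grad=2.0 * theta, t=t)
    print(t, theta, u)
```

With a negative first gradient and no absolute value, max(β2·0, grad) would be 0 and the scaling step would divide by zero; with abs(grad), the scale stays positive.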
Printed Page 389

A small detail for text consistency in the page: in "p.result()", "p.variables" and "p.reset_states()", "p" should be replaced by "precision" (as it is in the notebook).

Anonymous  Jun 22, 2023 
Printed Page 397
First sentence of the second paragraph under the section "Losses and Metrics based on Model Internals"

The first sentence ends without a space:
"... add_loss() method.For example..."

Velimir Graorkoski  Aug 19, 2023 
Printed Page 438
1st paragraph

It's written: "The API will also include a keras.layers.Discretization layer that will chop continuous data into different bins and encode each bin as a one-hot vector." Actually, according to the Keras API Reference, the one-hot encoding is done only if you set the output_mode argument to "one_hot" or "multi_hot", which is not the default.

Anonymous  Aug 27, 2023 
Printed Page 487
Fully Convolutional Networks paragraph

If the convolutional layer given as an example outputs 200 feature maps each of size 1 × 1, as written above in the paragraph, the tensor shape of this output is [batch size, 200, 1, 1] and not [batch size, 1, 1, 200].

Anonymous  Mar 27, 2024 
Printed Page 490
First bullet

In this bullet point, you said that YOLOv3 outputs 45 numbers per grid cell (5 bounding boxes times 4 coordinates, plus 5 objectness scores, plus 20 class probabilities). This is true only for YOLOv1, where each grid cell could detect only one object, so a single set of class probabilities was enough. Since YOLOv2, the class probabilities are no longer tied to the grid cell. Here is the quote from the YOLOv2 paper, which is also valid for YOLOv3 because nothing changed in this respect:
"When we move to anchor boxes we also decouple the class prediction mechanism from the spatial location and instead predict class and objectness for every anchor box"

So according to this sentence, if you have 20 classes and want to find 5 boxes for each grid cell, you will have 125 numbers per grid cell: 5 boxes x (4 [box size and location] + 1 objectness score + 20 class probabilities).

Kamil Rafałko  Jan 11, 2021 
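
The two counts in the report above follow from simple arithmetic, sketched here in plain Python. The function names are illustrative; the box and class counts are the ones quoted in the report.

```python
def outputs_shared_classes(n_boxes, n_classes):
    """Class probabilities predicted once per grid cell."""
    return n_boxes * 4 + n_boxes * 1 + n_classes

def outputs_per_box_classes(n_boxes, n_classes):
    """Class probabilities predicted separately for every anchor box."""
    return n_boxes * (4 + 1 + n_classes)

print(outputs_shared_classes(5, 20))   # 45
print(outputs_per_box_classes(5, 20))  # 125
```
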
Printed Page 492
3rd paragraph from the bottom of the YOLO section

"Faster-RCNN", should be "Faster R-CNN"

Velimir Graorkoski  Aug 23, 2023 
Printed Page 493
In the text between Figures 14-26 and 14-27

It's written: "Instead, they use a transposed convolutional layer: it is equivalent to first stretching the image by inserting empty rows and columns (full of zeros), then performing a regular convolution (see Figure 14-27)." I think it's worth mentioning that in this case the transposed convolutional layer is applied with strides = (2, 2) (and not the default strides = (1, 1)) and the regular convolution with strides = (1, 1) (as shown in the notebook).

Anonymous  May 02, 2024 
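
The equivalence described in the report above can be checked on a small one-dimensional case. This is a minimal plain-Python sketch, not the notebook's Keras code: it works in 1D instead of 2D, and the zero padding at both ends and the flipped kernel are details of this particular formulation. All function names and values are illustrative.

```python
def transposed_conv1d(x, kernel, stride):
    """Direct 1D transposed convolution: each input value stamps a scaled
    copy of the kernel into the output, `stride` positions apart."""
    out = [0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out

def stretch_then_conv1d(x, kernel, stride):
    """Same result via zero insertion followed by a regular stride-1 convolution."""
    # Insert (stride - 1) zeros between consecutive input values.
    stretched = []
    for i, value in enumerate(x):
        stretched.append(value)
        if i < len(x) - 1:
            stretched.extend([0] * (stride - 1))
    # Zero-pad both ends so the kernel can slide over every overlap.
    pad = [0] * (len(kernel) - 1)
    padded = pad + stretched + pad
    # Slide the flipped kernel over the padded input with stride 1.
    flipped = kernel[::-1]
    n_out = len(padded) - len(kernel) + 1
    return [sum(padded[p + j] * flipped[j] for j in range(len(kernel)))
            for p in range(n_out)]

x, kernel = [1, 2, 3], [1, 10, 100]
print(transposed_conv1d(x, kernel, stride=2))
print(stretch_then_conv1d(x, kernel, stride=2))
```

Both calls print the same list. The stride of 2 belongs to the transposed operation, while the convolution applied after stretching moves one position at a time.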
PDF Page 527
the code: [encoded] = np.array(...)


[encoded] = np.array(...) - 1


encoded = np.array(...) - 1


Eric Ho  Sep 01, 2021 
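
The two forms quoted in the report above differ only in the assignment target. As a minimal sketch of what the brackets do, the example below uses nested Python lists as a stand-in; iterating over a NumPy array also walks its first axis, so a 2D array with a single row unpacks the same way. That the elided np.array(...) call yields such a single-row array is an assumption here, and the values are made up.

```python
nested = [[4, 8, 15]]      # stand-in for a 2D result with a single row

[unpacked] = nested        # bracketed target: unpacks the only row
kept = nested              # plain target: keeps the outer dimension

print(unpacked)
print(kept)

# The bracketed form fails if there is not exactly one row.
try:
    [only] = [[1, 2], [3, 4]]
except ValueError as err:
    print("ValueError:", err)
```
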
Printed Page 700
Last sentence of the suggestion paragraph

This is useful if you want do not want Tensorflow... -> This is useful if you do not want Tensorflow...

Velimir Graorkoski  Sep 02, 2023 
Printed Page 734
Between the solution of Exercise 7. (of Chapter 11) and Chapter 12's section.

"For the solutions to exercises 8, 9 and 10" should be "For the solution to exercise 8" as there are only 8 exercises in Chapter 11 and its corresponding Jupyter notebook.

Anonymous  May 18, 2023 
Printed Page 763
last paragraph above equation C-5

From what I understood, it says that t(wx+b)=1 can be used to calculate the value of b. Then it mentions b=t-wx. Shouldn't it be b=1/t-wx?

SYED Hamza Mohiuddin  Sep 14, 2022 
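
The two expressions in the question above coincide whenever the label t can only be -1 or +1, because 1/t equals t in both cases. Whether the appendix uses that label convention is an assumption here, as are the function name and the toy weights.

```python
def bias_from_support_vector(w, x, t):
    """Solve t * (w . x + b) = 1 for b, for a label t that is -1 or +1."""
    if t not in (-1, 1):
        raise ValueError("this sketch assumes labels of -1 or +1")
    wx = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / t - wx

# Hypothetical weights and instance, chosen only for illustration.
w, x = [2.0, -1.0], [0.5, 3.0]
for t in (-1, 1):
    wx = sum(wi * xi for wi, xi in zip(w, x))
    print(t, bias_from_support_vector(w, x, t), t - wx)
```

For both labels, the two printed values of b are identical.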
Printed Page 795
Last sentence of the warning paragraph

The sentence starts with: "It best to assume..."
Should be: "It is best to assume..."

Velimir Graorkoski  Sep 05, 2023