Errata

Feature Engineering for Machine Learning



The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Color Key: Serious technical mistake · Minor technical mistake · Language or formatting error · Typo · Question · Note · Update

Version Location Description Submitted by Date submitted
ePub Page 1
Figure 2-17

Just wanted to share with you something that I think is a mistake in Figure 2-17.

As is said in the paragraph following the figure, this figure plots information in data space. So, if the letter x refers to variables (which, judging from the formula placed in the figure, it does), the axes of this figure should not be labeled with x but with the names of the observations (as done in Figure 2-2); it is the dots that should be labeled X1, X2, ... Xm.

Ramiro Heraclio  Apr 15, 2018 
Printed Page 3
3rd paragraph

Under the Features section, the authors state, “A feature is a numeric representation of raw data.” Nevertheless, features can also be categorical.
Should it instead read “A feature is a representation of raw data”?

Manoj Jayabalan  Mar 11, 2019 
PDF Page 3
Last Paragraph

If this was meant as an overview from 30,000 feet, it just about scrapes the surface.

It misses the fact that the activity has a business purpose: there is a problem to solve and a target against which to show effectiveness, either against some existing technique or approach, or against what the AI/ML solution is supposed to find out. It is fun to run the numbers, but the point is to bring more efficient or better insight into a problem from the data.

Geoffrey Leigh  Jan 09, 2020 
PDF Page 61
3rd paragraph

In Chapter 4 an unusual variant of tf*idf is described - with tf being raw word counts. This is not a default in most definitions; adjusting tf by document length is much more common. I'm not aware of implementations where the presented tf*idf variant is a default; in scikit-learn, NLTK, gensim, ElasticSearch or "Recommended tf–idf weighting schemes" in Wikipedia tf is normalized by the document length by default.

This is not a problem on its own; the problem is that while analysis in Chapter 4 is valid for the presented tf*idf variant, it can't be generalized for more commonly used tf*idf variants. This makes statements about tf*idf misleading, e.g. "Tf-Idf = Column Scaling" is not true in most cases. More importantly, the analysis of tf*idf effect ("Deep Dive: What Is Happening?", p72-75) doesn't hold if one of the "default" tf*idf variants is used, as they're not just a column scaling.

On p65 there is a note: "Note that doing tf-idf then l2 normalization is the same as doing l2 normalization alone." While technically correct, it is then illustrated with a code sample in which the result of text.TfidfTransformer(norm=None).transform is column-wise L2-normalized. This is somewhat misleading, because the default text.TfidfTransformer(norm='l2') performs a very different kind of normalization: rows are normalized, not columns. From the description it sounds like changing the default value of the `norm` argument is just a way to get both normalized and unnormalized tf*idf results without calling TfidfTransformer twice, but that's not the case: the default text.TfidfTransformer() performs a completely different computation, which changes all the following analysis, since the scaling is not column-wise.

I think Chapter 4 should have used IDF scaling as an example, not TF*IDF, or make it clear that analysis doesn't hold for default/common TF*IDF implementations people will be using in practice.
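The difference the submitter describes can be checked directly. A minimal sketch, assuming scikit-learn is available (the two toy documents are illustrative, not from the book):

```python
# With norm='l2' (the default), TfidfTransformer normalizes each document
# ROW to unit length; with norm=None no normalization happens, so the
# result is only a column-wise idf scaling of the raw counts.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

docs = ["the cat sat", "the dog sat on the mat"]  # illustrative documents
counts = CountVectorizer().fit_transform(docs)

default_tfidf = TfidfTransformer(norm='l2').fit_transform(counts).toarray()
raw_tfidf = TfidfTransformer(norm=None).fit_transform(counts).toarray()

# Row norms of the default result are exactly 1 (row-wise normalization)...
print(np.linalg.norm(default_tfidf, axis=1))  # -> [1. 1.]
# ...while the norm=None result is not row-normalized at all.
print(np.linalg.norm(raw_tfidf, axis=1))
```

This makes the submitter's point concrete: the two settings are not the same computation up to a missing normalization step; they normalize along different axes.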

Mikhail Korobov  Aug 28, 2018 
Printed Page 104
Equation 6-7, near the bottom

Objective function for principal components, matrix-vector formulation, should be

w^T X^T X w

(where "^T" means "transpose") instead of

w^T w

The same goes for Equation 6-8 on the next page.

Anonymous  Dec 11, 2020 
Printed Page 104
Equation 6-7

The objective function for principal components, matrix-vector formulation, should be maximizing z'z over w instead of w'w (where ' indicates a transpose, and z = Xw). The same goes for Equation 6-8 on the next page (105).

You cannot maximize w'w over w, since w is constrained to be a unit vector.

Though it is most probably a typo, it changes the entire mathematical meaning of the objective function.
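The submitter's argument can be verified numerically. A minimal sketch with NumPy (the data matrix X is arbitrary illustrative data, not from the book):

```python
# For a unit vector w, w.T @ w is always 1, so it cannot be the objective;
# the quantity that varies with w, and is maximized by the first principal
# direction, is w.T @ X.T @ X @ w.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = X - X.mean(axis=0)          # center the data

def objective(w):
    w = w / np.linalg.norm(w)   # constrain w to the unit sphere
    return w @ X.T @ X @ w

# The top eigenvector of X.T @ X attains the maximum of the objective.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
w_best = eigvecs[:, -1]                      # eigenvector of largest eigenvalue

w_random = rng.normal(size=3)
print(objective(w_best) >= objective(w_random))   # -> True
print(np.isclose(objective(w_best), eigvals[-1])) # -> True
```

This is just the Rayleigh-quotient characterization of the leading eigenvector, which is what Equation 6-7 is presumably meant to express.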

Anonymous  Feb 14, 2021 
PDF Page 136
last paragraph

Figure 8-3 illustrates examples ......
The center image contains vertical stripes; therefore, the horizontal gradient is zero.

To be changed:
therefore, the vertical gradient is zero.

* horizontal stripes >>> horizontal gradient is zero
* vertical stripes >>> vertical gradient is zero
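The correction is easy to confirm with np.gradient. A minimal sketch (the striped array is an illustrative stand-in for the figure's image):

```python
# An image with vertical stripes varies only along the horizontal axis,
# so its VERTICAL gradient is zero, matching the submitter's correction.
import numpy as np

vertical_stripes = np.tile([0, 1, 0, 1], (4, 1)).astype(float)  # columns alternate
dy, dx = np.gradient(vertical_stripes)  # dy: along rows (vertical), dx: along columns (horizontal)

print(np.all(dy == 0))   # -> True  (vertical gradient is zero)
print(np.any(dx != 0))   # -> True  (horizontal gradient is not)
```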

Woohyun Kim  Aug 29, 2018 
PDF Page 163
1st row, 2nd row

The code does not work: feature_array as defined on page 162 requires 3 arguments, but only 2 are provided here.

Anonymous  Mar 06, 2021