Errata for Scaling Machine Learning with Spark

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version: PDF
Location: Page 15, 1st paragraph on High Versus Low Bias
Description:

It's a great book, but I would like to point out a very minor error in language:

In the section discussing "High Versus Low Bias," the book states in the second line: "A model with high bias makes too many assumptions about the results, leading to overfitting to the training data. Such a model tends to have difficulty making accurate predictions about new data that doesn’t exactly conform to the data it has already seen and will perform badly on test data and in production. Conversely, a model with low bias incorporates fewer assumptions about the data. Taken to an extreme, this can also be problematic as it can result in underfitting, where the model fails to learn enough about the data to classify it accurately." While the passage correctly conveys that a model with high bias makes too many assumptions, it mistakenly associates high bias with overfitting and low bias with underfitting. As I have learned it, high bias causes underfitting, and low bias leads to overfitting. Several resources I have looked at on the internet disagree with the book on this point, as does what I was taught in class. However, I might be wrong; if so, please clarify.

Submitted by: Dipit Vasdev
Date submitted: Dec 07, 2023
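
The distinction raised in this erratum can be checked with a small experiment. The following is a minimal sketch, not taken from the book: the synthetic sine data, the noise level, and both toy models (`constant_model` and `nearest_neighbour_model`) are assumptions chosen only for illustration. The constant model makes the strongest possible assumption about the target (high bias), while the 1-nearest-neighbour model makes almost none (low bias) and simply memorises the training set.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs use the same synthetic data


def make_data(n):
    """Draw n noisy samples of sin(2*pi*x) with x uniform on [0, 1)."""
    xs = [random.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.3) for x in xs]
    return xs, ys


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x, train_y = make_data(30)
test_x, test_y = make_data(200)

# High-bias model: assumes the target is a single constant,
# so it cannot follow the sine shape at all.
mean_y = sum(train_y) / len(train_y)


def constant_model(x):
    return mean_y


# Low-bias model: 1-nearest-neighbour, which assumes almost nothing
# and reproduces the label of the closest training point, noise included.
def nearest_neighbour_model(x):
    nearest = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[nearest]


results = {
    "constant (high bias)": (
        mse(constant_model, train_x, train_y),
        mse(constant_model, test_x, test_y),
    ),
    "1-NN (low bias)": (
        mse(nearest_neighbour_model, train_x, train_y),
        mse(nearest_neighbour_model, test_x, test_y),
    ),
}

for name, (train_err, test_err) in results.items():
    print(f"{name}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

In this sketch the high-bias model has a large error even on its own training data, which is the usual signature of underfitting. The low-bias model has zero training error but a nonzero test error, which is the usual signature of overfitting. That pattern matches the conventional usage described in the erratum.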