Chapter 10: Outliers

Outliers are extreme points in a data set, and they can noticeably reduce the accuracy of our models.

You may remember this graph from the chapter on central tendency and dispersion. I am going to use it to explain outliers. In the picture below, 96, 69, and 71 are outliers.

[Figure: bar diagram of test scores]

How is this concept relevant to machine learning?

Can you afford to have outliers in your dataset? Yes, provided the number of such outliers is negligible. A better way to handle outliers, though, is to remove them, because outliers can skew the results of machine learning models. Unwanted inferences can result if outliers are present in ...
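To make the skewing effect concrete, here is a minimal sketch in Python. The scores are hypothetical (they are not the data behind the graph above), and the 1.5 x IQR cutoff is one common rule of thumb for flagging outliers, not necessarily the method this chapter goes on to use. The helper name iqr_filter is also just an illustration.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
# The value 20 sits far away from the rest of the data.
scores = [80, 82, 79, 81, 83, 80, 78, 82, 20]


def iqr_filter(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


cleaned = iqr_filter(scores)

# Compare the mean before and after dropping the flagged point.
print("mean with outlier:   ", statistics.mean(scores))
print("mean without outlier:", statistics.mean(cleaned))
# The median barely reacts to a single extreme value.
print("median with outlier: ", statistics.median(scores))
```

In this made-up sample, a single extreme score is enough to pull the mean well below where most students actually scored, while the median stays put. Whether dropping the point is appropriate still depends on why it is extreme.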
