Univariate outlier detection

To explain the reason behind why a data point is an outlier, you are first required to locate the possible outliers in your data. There are quite a few approaches some are univariate (you can observe each singular variable at once), while the others are multivariate (they consider more variables at the same time). The univariate methods are usually based on EDA and visualizations such as boxplots (which have been introduced at the beginning of the present chapter; we will talk more about boxplots more specifically in Chapter 5, Visualization, Insights, and Results).

There are a couple of rules of thumb to keep in mind when chasing outliers by examining single variables. In fact, outliers may be spotted as extreme ...

Get Python Data Science Essentials - Third Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.