Univariate outlier detection

To explain why a data point is an outlier, you first have to locate the possible outliers in your data. There are several approaches: some are univariate (you examine each variable on its own), while others are multivariate (they consider several variables at the same time). Univariate methods are usually based on EDA and visualizations such as boxplots (introduced at the beginning of this chapter; we discuss boxplots in more detail in Chapter 5, Visualization, Insights, and Results).
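
As a minimal sketch of the boxplot idea applied to a single variable, the snippet below flags values that fall outside the whiskers, assuming the conventional 1.5 × IQR whisker length (Tukey's fences). The function name `iqr_outlier_mask`, the parameter `k`, and the sample data are hypothetical and only serve the illustration; they are not taken from the book.

```python
import numpy as np


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask marking points outside the boxplot whiskers.

    Assumption: whiskers extend k * IQR beyond the first and third
    quartiles (k=1.5 is the usual boxplot convention).
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])  # first and third quartiles
    iqr = q3 - q1                             # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)


# Hypothetical single variable with one clearly extreme value
sample = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 95.0])
mask = iqr_outlier_mask(sample)
print(sample[mask])  # the values flagged as candidate outliers
```

The mask only marks candidates for inspection; whether a flagged value is a genuine outlier or a legitimate extreme observation still has to be judged in context.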

There are a couple of rules of thumb to keep in mind when chasing outliers by examining single variables. In fact, outliers may be spotted as extreme ...
