Statistical Tableau

by Ethan Lang
May 2024
Content level: Beginner to intermediate
316 pages
7h 54m
English
O'Reilly Media, Inc.

Chapter 7. Anomaly Detection on Nonnormalized Data

In Chapter 6, I showed you three ways to visualize outliers when your data is normally distributed. However, you will often come across data that isn’t normally distributed. Applying methods that assume a normal distribution to such data could lead to false conclusions or misguided decisions by you and your stakeholders. That is why the exploratory tactics covered in Chapter 4 are so important.

In this chapter, I will show you three methods you can implement to visualize outliers when you are working with nonnormalized data. The methods are the median absolute deviation, Tukey’s fences, and the modified z-score test.

Understanding Median Absolute Deviation

The median absolute deviation (MAD) is a statistical measure that quantifies the dispersion or variability of a dataset. To calculate it, subtract the median from each value and take the absolute value of the result, which gives you the absolute deviation of each data point. Then you find the median of those absolute deviations, which gives you the MAD. The mathematical formula to calculate the MAD is as follows:

$$\mathrm{MAD} = \mathrm{Median}\left(\lvert X_i - \mathrm{Median} \rvert\right)$$

where

$\mathrm{MAD}$ = median absolute deviation

$X_i$ = each value

$\mathrm{Median}$ = median value
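
The formula above can be sketched in a few lines of Python. This is a minimal illustration and not the Tableau calculation used in the book: the function name `median_absolute_deviation` and the input list are hypothetical, and the only library call is `statistics.median` from the Python standard library.

```python
from statistics import median


def median_absolute_deviation(values):
    """Return the median of the absolute deviations from the median."""
    if not values:
        raise ValueError("values must contain at least one number")
    center = median(values)                          # Median of the raw values
    deviations = [abs(x - center) for x in values]   # |X_i - Median| for each value
    return median(deviations)                        # Median of those deviations


# Hypothetical data (not from the book): one extreme value among small ones.
readings = [1, 2, 3, 4, 100]
print(median_absolute_deviation(readings))  # prints 1
```

Because both steps use a median rather than a mean, the extreme value of 100 in this made-up list does not inflate the result.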

The steps to find the MAD are very simple when you break this formula down. Consider this dataset as an example: 5, 10, 12, 15, 18. Here are the steps to find the MAD from this sample dataset:

  1. Find the median. In this dataset you can see that the median value is ...
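
The remaining steps are cut off in this preview. As a hedged sketch that relies only on the formula given earlier, the calculation for the sample dataset can be traced in Python as follows. The three-step breakdown in the comments and the variable names are assumptions made for illustration and may not match the book's own step list.

```python
from statistics import median

data = [5, 10, 12, 15, 18]  # Sample dataset from the text

# Assumed step 1: find the median of the raw values.
data_median = median(data)

# Assumed step 2: take the absolute deviation of each value from that median.
abs_deviations = [abs(x - data_median) for x in data]

# Assumed step 3: the median of the absolute deviations is the MAD.
mad = median(abs_deviations)

print("median:", data_median)
print("absolute deviations:", abs_deviations)
print("MAD:", mad)
```

Printing the intermediate values makes it easy to check each stage by hand before building the equivalent calculation in another tool.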

ISBN: 9781098151782