O'Reilly logo

Data Analysis with R - Second Edition by Tony Fischetti

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Checking for outliers, entry errors, or unlikely data points

Automatic outlier detection (sometimes known as anomaly detection) is something that a lot of analysts scoff at and view as a pipe dream. Though the creation of a routine that automagically detects all erroneous data points with 100 percent specificity and precision is impossible, unmistakable mis-entered data points and flagrant outliers are not hard to detect even with very simple methods. In my experience, there are a lot of errors of this type.

One simple way to detect the presence of a major outlier is to confirm that every data point is within some n number of standard deviations away from the mean of the group. assertr has a function, within_n_sds, in conjunction with the ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required