March 2018
Beginner to intermediate
570 pages
13h 42m
English
Automatic outlier detection (sometimes known as anomaly detection) is something that a lot of analysts scoff at and view as a pipe dream. Though the creation of a routine that automagically detects all erroneous data points with 100 percent specificity and precision is impossible, unmistakable mis-entered data points and flagrant outliers are not hard to detect even with very simple methods. In my experience, there are a lot of errors of this type.
One simple way to detect the presence of a major outlier is to confirm that every data point is within some n number of standard deviations away from the mean of the group. assertr has a function, within_n_sds, in conjunction with the ...