CHAPTER 7 Identifying Anomalous Outliers: Part 2

THE PREVIOUS CHAPTER LOOKED at outliers, which are data points that differ significantly from the norm. These outliers could be caused by several issues including fraud or errors. In the previous chapter we looked at size as compared to the average size of the amounts in a data set (the summation test), the relative size of a group of transactions (the largest subsets test), and the growth in the size of a group of transactions over time (the largest subset growth test). The common thread was that we were looking at large amounts as compared to the average size of the records in our data set. The focus in this chapter is on being large as measured by the ratio of something to something else. That is, we will ask how large an item is compared to the other items in its subset and we will measure the size using a ratio.

The test reviewed in this chapter is called the relative size factor (RSF) test. This test has been found to be a powerful test for detecting both intentional and unintentional errors. The test identifies subsets where the largest amount is out of line with the other amounts in that subset. This difference in magnitude often arises because the largest record either belongs to another subset, or belongs to the subset in question but has an incorrectly recorded amount.
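As a rough illustration of the idea, the sketch below computes one ratio per subset. The excerpt does not state the exact formula, so this sketch assumes the factor is the largest amount in a subset divided by the second largest amount. The function name `relative_size_factors`, the vendor labels, and the amounts are all hypothetical and are not taken from the book.

```python
from collections import defaultdict


def relative_size_factors(records):
    """Return {subset: largest / second_largest} for each subset.

    Assumption (not from the source text): the factor is the ratio of the
    largest amount to the second largest amount within the same subset.
    """
    amounts_by_subset = defaultdict(list)
    for subset, amount in records:
        # Assumption: only positive amounts are compared, which also
        # avoids dividing by zero or by a negative amount.
        if amount > 0:
            amounts_by_subset[subset].append(amount)

    factors = {}
    for subset, amounts in amounts_by_subset.items():
        if len(amounts) < 2:
            continue  # a ratio needs at least two records in the subset
        largest, second = sorted(amounts, reverse=True)[:2]
        factors[subset] = largest / second
    return factors


# Hypothetical records: (subset label, amount)
records = [
    ("Vendor A", 600.00), ("Vendor A", 580.00), ("Vendor A", 610.00),
    ("Vendor B", 6000.00), ("Vendor B", 60.00), ("Vendor B", 55.00),
    ("Vendor C", 120.00),
]

factors = relative_size_factors(records)

# Rank subsets so the most out-of-line largest amount comes first.
flagged = sorted(factors.items(), key=lambda kv: kv[1], reverse=True)
```

In this made-up data, Vendor B ranks first because its largest amount is far out of line with its other amounts, which is the pattern the test is meant to surface. Vendor C is skipped because a single record gives nothing to compare against.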

The RSF test was developed in the mid-nineties as a result of a company in Cleveland wiring $600,000 in error to the bank account ...
