CHAPTER 5 Benford's Law: Completing the Cycle
The high-level tests, such as the data profile, the histogram, the periodic graph, the descriptive statistics, the first-two digits test, and the Benford-related conformity statistics, give us valuable insights into the internal diagnostics of our data. These tests give us a deeper understanding of the entity and our data, and they can also point us to large-scale frauds, errors, and anomalies when these issues are significant enough to affect the results. The two tests described in this chapter are more focused than our suite of high-level tests. The number duplication test tells us which numbers occurred most frequently in our data. The last-two digits test analyzes the last-two digits of our data and is often effective at detecting invented numbers. Both tests can be run in Excel, Access, or IDEA. The use of R to run the number duplication test is also demonstrated.
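As a rough illustration of what the two tests compute, the following minimal sketch uses Python rather than the tools named above. It rests on several assumptions: amounts are in dollars and cents, the last-two digits are taken to be the cents, and each of the 100 last-two digit combinations is expected to occur with a proportion of 1/100. The function names and the sample amounts are hypothetical and are not taken from the chapter's data.

```python
from collections import Counter
from decimal import Decimal


def first_two_digits(amount):
    """First-two digits (10-99) of a nonzero amount, e.g. 1234.5 -> 12."""
    d = abs(Decimal(str(amount)))
    # adjusted() is the exponent of the leading digit; shift that digit
    # into the tens place and truncate everything after the units place.
    return int(d.scaleb(1 - d.adjusted()))


def number_duplication(amounts, top=3):
    """Most frequent amounts as (amount, count, first_two_digits) tuples."""
    counts = Counter(Decimal(str(a)) for a in amounts)
    return [(amt, n, first_two_digits(amt)) for amt, n in counts.most_common(top)]


def last_two_digits(amount):
    """Last-two digits of an amount taken to the cent (assumption), e.g. 1234.56 -> 56."""
    cents = int((abs(Decimal(str(amount))) * 100).to_integral_value())
    return cents % 100


def last_two_digits_test(amounts):
    """Actual proportion minus the assumed uniform 1/100, for combinations 00-99."""
    counts = Counter(last_two_digits(a) for a in amounts)
    n = len(amounts)
    return {d: counts.get(d, 0) / n - 0.01 for d in range(100)}


# Hypothetical sample amounts, for illustration only.
sample = ["50.00", "50.00", "50.00", "24.99", "24.99", "1234.56", "18.75", "50.00"]

for amt, n, ftd in number_duplication(sample, top=2):
    print(f"amount={amt} count={n} first-two digits={ftd}")

# Show the last-two digit combinations that exceed the uniform expectation.
for digits, diff in last_two_digits_test(sample).items():
    if diff > 0:
        print(f"last-two digits {digits:02d}: excess proportion {diff:.3f}")
```

Storing amounts as `Decimal` values built from strings avoids floating-point rounding when identical amounts are counted and when the cents are extracted. Reporting the first-two digits alongside each frequent amount makes it easy to relate a duplicated amount to a spike on the first-two digits graph.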
THE NUMBER DUPLICATION TEST
The number duplication test tells us which specific amounts caused the spikes on the first-two digits graph. Such spikes are usually linked to specific amounts occurring abnormally often. For example, the first-two digits graph of the District of Columbia purchasing card data in Figure 4.6 had 12 visible spikes, and the three largest Z-statistics were for the first-two digits 50, 24, and 25, in that order. The results are presented again, this time in IDEA. The dialog box is activated by ...