Checking for Randomness
After generating the random numbers as discussed in the previous sections, you may want to make sure that they really are random. To do this, examine the statistical distribution of the generated data in the table.
SQL> SELECT MIN(balance), MAX(balance),
            AVG(balance), STDDEV(balance)
     FROM accounts;

MIN(BALANCE) MAX(BALANCE) AVG(BALANCE) STDDEV(BALANCE)
------------ ------------ ------------ ---------------
    10008.03     99889.97   54948.4654      25989.9271
As shown, the average (often referred to in statistics as the mean) balance is 54,948.4654 and the standard deviation is 25,989.9271. Statistics describes one common pattern for how values are distributed around the average:
Assume that A = average and S = standard deviation; thus:
About 68% of values lie between A − S and A + S
About 95% of values lie between A − 2 × S and A + 2 × S
About 99.7% of values lie between A − 3 × S and A + 3 × S
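The three bands above can be checked directly against the table. The query below is a minimal sketch, assuming the same accounts table and balance column used in the earlier query; the aliases (acc, stats, a, s, pct_within_1s, and so on) are illustrative names and not part of the original example.

```sql
-- Percentage of rows whose balance falls within 1, 2, and 3 standard
-- deviations (s) of the average (a). The inline view "stats" returns a
-- single row, so the cross join attaches a and s to every account row.
SELECT ROUND(100 * AVG(CASE WHEN acc.balance BETWEEN stats.a - stats.s
                                               AND stats.a + stats.s
                            THEN 1 ELSE 0 END), 2) AS pct_within_1s,
       ROUND(100 * AVG(CASE WHEN acc.balance BETWEEN stats.a - 2 * stats.s
                                               AND stats.a + 2 * stats.s
                            THEN 1 ELSE 0 END), 2) AS pct_within_2s,
       ROUND(100 * AVG(CASE WHEN acc.balance BETWEEN stats.a - 3 * stats.s
                                               AND stats.a + 3 * stats.s
                            THEN 1 ELSE 0 END), 2) AS pct_within_3s
FROM   accounts acc
       CROSS JOIN (SELECT AVG(balance)    AS a,
                          STDDEV(balance) AS s
                   FROM   accounts) stats;
```

Results close to 68, 95, and 99.7 would suggest a bell-shaped spread. For evenly spread values, the first figure should instead be closer to 58 and the second should reach 100, because two standard deviations on either side of the average already cover the whole range.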
Data that follows this pattern is said to be normally distributed. In my case, however, I want an even spread of data, not a normal distribution. Here I have:
A = 54,948.4654 and S = 25,989.9271
If the data were normally distributed, about 68% of it would lie between 28,958.5383 and 80,938.3925, as the following expressions show:
54948.4654 - 25989.9271 = 28958.5383
and
54948.4654 + 25989.9271 = 80938.3925
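Because the goal here is an even spread rather than a normal distribution, a more direct check is to count how many values land in equal-width ranges. The sketch below assumes the balances were generated between 10,000 and 100,000 (an assumption based on the minimum and maximum shown earlier, not something stated in the text) and uses Oracle's WIDTH_BUCKET function to divide that range into 10 buckets. The bucket count and the column aliases are arbitrary choices.

```sql
-- Count the accounts in each of 10 equal-width balance ranges.
-- Assumed bounds: 10000 (inclusive) to 100000 (exclusive).
SELECT WIDTH_BUCKET(balance, 10000, 100000, 10)          AS bucket,
       COUNT(*)                                          AS accounts_in_bucket,
       -- Each bucket's share of all rows, as a percentage
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 2) AS pct_of_total
FROM   accounts
GROUP  BY WIDTH_BUCKET(balance, 10000, 100000, 10)
ORDER  BY bucket;
```

If the values are evenly spread, each of the 10 buckets should hold roughly 10% of the rows. Any balances outside the assumed bounds would appear in bucket 0 or bucket 11.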
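These numbers indicate that the values are well varied and not crowded around the average: one standard deviation on either side of the average already spans more than half of the range between the minimum (10,008.03) and the maximum (99,889.97). The data therefore satisfies our definition of a truly random sample. In creating a test ...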