June 2017
Because both the test and training datasets were generated from simulated normal distributions, the data is almost too perfect. To make the data more realistic, and to help you develop more simulation skills, we can introduce some random error into the test dataset. For example, we might vary each test variable by up to plus or minus 10%.
To accomplish this, we will first generate an error distribution that we can apply to the original data. For simplicity, we can define four discrete error bins, each representing a different percentage adjustment to the data. Errors between -10% and +10% are reasonable limits.
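The idea of binned percentage errors can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the book's own code. The four bin values (-10%, -5%, +5%, +10%), the equal probability of each bin, the helper name `add_binned_error`, and the simulated mean and standard deviation are all assumptions made for illustration; the text specifies only that there are four discrete bins within the -10% to +10% limits.

```python
import random

# Hypothetical bin values: the text only says "four discrete bins"
# between -10% and +10%, so these particular numbers are an assumption.
ERROR_BINS = [-0.10, -0.05, 0.05, 0.10]


def add_binned_error(values, bins=ERROR_BINS, seed=None):
    """Return a copy of `values` with each entry scaled by (1 + e),
    where e is drawn uniformly at random from `bins`."""
    rng = random.Random(seed)
    return [v * (1 + rng.choice(bins)) for v in values]


# Simulated test variable drawn from a normal distribution
# (mean 100 and standard deviation 15 are illustrative only).
rng = random.Random(42)
test_values = [rng.gauss(100, 15) for _ in range(10)]

# Apply one randomly chosen percentage adjustment to each observation.
noisy_values = add_binned_error(test_values, seed=1)

for original, noisy in zip(test_values, noisy_values):
    print(f"{original:8.2f} -> {noisy:8.2f} ({noisy / original - 1:+.0%})")
```

Each printed row shows an original value, its adjusted value, and the percentage adjustment that was applied. Drawing from a small set of discrete bins keeps the error easy to inspect; weighting the bins unequally would be a natural variation if some error sizes should be more common than others.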
We will generate ...