Practical Predictive Analytics by Ralph Winters

Introducing errors into the test data set

Since both the test and training data sets were generated from simulated random normal distributions, the data is almost too perfect. To compensate for this, to make the data more realistic, and to help you develop more simulation skills, we can introduce some random error into the test data set. For example, we might want to add a variation of up to plus or minus 10% to the test data variables.

To accomplish this, we will first generate an error distribution that we can apply to the original data. For simplicity, we can define four discrete error bins, each representing a different percentage adjustment to the data. Keeping the percentage errors between -10% and +10% is a reasonable limit.
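As a rough illustration of the binned-error idea, here is a minimal Python sketch. It is not the book's own code. The four bin values (-10%, -5%, +5%, +10%), the function name `add_binned_error`, and the stand-in variable `test_values` are all assumptions made for this example; the text only states that there are four bins within the -10% to +10% range.

```python
import random

# Hypothetical bin values: the text only says "four discrete error bins"
# between -10% and +10%, so these exact percentages are an assumption.
ERROR_BINS = [-0.10, -0.05, 0.05, 0.10]


def add_binned_error(values, bins=ERROR_BINS, seed=None):
    """Return a new list where each value is scaled by (1 + a randomly chosen bin)."""
    rng = random.Random(seed)
    return [v * (1 + rng.choice(bins)) for v in values]


# Stand-in for one simulated test variable drawn from a normal distribution.
# The mean and standard deviation here are illustrative, not from the source.
_rng = random.Random(42)
test_values = [_rng.gauss(100, 15) for _ in range(1000)]

# Apply the percentage adjustments to the test data only.
noisy_values = add_binned_error(test_values, seed=7)
```

Each bin is chosen with equal probability here; unequal weights could be passed through `random.Random.choices` instead if some error sizes should be more common than others.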

We will generate ...
