R: Recipes for Analysis, Visualization and Machine Learning by Chiu Yu-Wei, Atmajitsinh Gohil, Shanthi Viswanathan, Viswa Viswanathan

Creating random data partitions

Analysts need an unbiased evaluation of the quality of their machine learning models. To get one, they partition the available data into two parts: one part is used to build the model, and the rest is retained as "hold out" data. After building the model, they evaluate its performance on the hold out data. This recipe shows you how to partition data. It treats numeric and categorical target variables separately, and it covers creating either two or three partitions.
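The idea of a random partition can be sketched in base R as follows. This is only an illustration under stated assumptions, not the recipe's own code: the data frame `dat`, the class vector `cls`, the seed, and the 80/20 and 70/15/15 proportions are all hypothetical stand-ins, and synthetic data is used in place of the chapter's CSV files.

```r
# Hypothetical stand-in data (assumption); the recipe uses the chapter's CSV files.
set.seed(1000)  # arbitrary seed so the random split is reproducible
dat <- data.frame(
  x = rnorm(100),
  y = rnorm(100)  # illustrative numeric target
)
n <- nrow(dat)

# Two partitions: 80% for model building, 20% held out.
train_idx <- sample(n, size = round(0.8 * n))
train <- dat[train_idx, ]
holdout <- dat[-train_idx, ]

# Three partitions: 70% training, 15% validation, 15% test.
shuffled <- sample(n)  # random permutation of the row numbers
n_train <- round(0.70 * n)
n_valid <- round(0.15 * n)
train3 <- dat[shuffled[1:n_train], ]
valid3 <- dat[shuffled[(n_train + 1):(n_train + n_valid)], ]
test3 <- dat[shuffled[(n_train + n_valid + 1):n], ]

# Categorical target: sample 80% within each class so that the
# class proportions are preserved in the training partition.
cls <- factor(rep(c("a", "b"), times = c(70, 30)))  # illustrative classes
rows_by_class <- split(seq_along(cls), cls)
strat_idx <- unlist(lapply(rows_by_class, function(i) {
  sample(i, size = round(0.8 * length(i)))
}))
```

Sampling within each class is one simple way to keep a categorical target's class mix similar across partitions. The caret package's `createDataPartition` function provides a ready-made stratified split; whether the recipe relies on it is not shown in this excerpt.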

Getting ready

If you have not already done so, make sure that the BostonHousing.csv and boston-housing-classification.csv files from the code files of this chapter are in your ...