R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

How it works...

The createDataPartition() function randomly selects row indices based on the vector supplied as its first argument. Rather than sampling uniformly from all rows of the data frame, it stratifies the sample on the values of that vector, as we now describe.

If supplied with a numeric vector as the first argument, then createDataPartition() applies the random selection process by percentile groups, so as to get a good sampling of rows from the entire range of the target variable. By default, it considers five groups, but we can control this through the optional groups argument.
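
The numeric case can be sketched as follows. This is a minimal illustration rather than the book's own listing: it assumes the caret package is installed, and the target vector y, the seed, and the 80% split are made up for the example.

```r
library(caret)  # assumption: the caret package is installed

set.seed(1000)  # arbitrary seed so the split is reproducible

# Hypothetical numeric target: 200 draws from a skewed distribution
y <- rexp(200, rate = 0.1)

# About 80% of the rows, sampled within the default percentile groups
train_idx <- createDataPartition(y, p = 0.8, list = FALSE)

# Same proportion, but stratified over 10 percentile groups instead
train_idx_10 <- createDataPartition(y, p = 0.8, list = FALSE, groups = 10)

# Compare the target's distribution in the full data and in the sample
summary(y)
summary(y[train_idx])
```

Because rows are drawn within each percentile group, the two summaries should look broadly similar even for a skewed target. Increasing groups makes the stratification finer, at the cost of fewer rows per group.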

If supplied with a vector of factors, the function randomly samples for each value of the factor from the cases, thereby ensuring a good representation of all factor values in the ...
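
For the factor case, a minimal sketch along the same lines is shown here. The class labels cls and their 90/10 imbalance are hypothetical, chosen only to make the per-level sampling visible, and the caret package is again assumed to be installed.

```r
library(caret)  # assumption: the caret package is installed

set.seed(1000)  # arbitrary seed so the split is reproducible

# Hypothetical imbalanced class labels: 90 "no" cases and 10 "yes" cases
cls <- factor(rep(c("no", "yes"), times = c(90, 10)))

# About 80% of the rows, sampled separately within each factor level
train_idx <- createDataPartition(cls, p = 0.8, list = FALSE)

# Class counts in the full data and in the sampled partition
table(cls)
table(cls[train_idx])
```

Comparing the two tables shows whether the rare level keeps roughly its original share in the partition, which a plain sample() over all rows would not guarantee.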