Practical Predictive Analytics by Ralph Winters

Creating the test and training datasets

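Now that the transformations are complete, we will create the training and test data frames using a 50/50 split. The rows are first shuffled with a fixed seed so that the split is reproducible; the first half of the shuffled rows becomes the training set and the remainder becomes the test set. Note that the training set overwrites the original OnlineRetail data frame: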

# Take a sample of full vector
nrow(OnlineRetail)
> [1] 536068
pctx <- round(0.5 * nrow(OnlineRetail))
set.seed(1)
# randomize rows
df <- OnlineRetail[sample(nrow(OnlineRetail)), ]
rows <- nrow(df)
OnlineRetail <- df[1:pctx, ]  #training set
OnlineRetail.test <- df[(pctx + 1):rows, ]  #test set
rm(df)
# Display the number of rows in the training and test datasets.
nrow(OnlineRetail)
> [1] 268034
nrow(OnlineRetail.test)
> [1] 268034
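
The same shuffle-then-cut idea can be wrapped in a small function and checked on toy data. The following is a minimal sketch, not code from the book: the helper name `split_train_test`, its parameters `train_frac` and `seed`, and the `toy` data frame are all hypothetical and stand in for the OnlineRetail data. It uses base R only.

```r
# Hypothetical helper (not from the book): shuffle the rows, then cut at a fraction.
split_train_test <- function(data, train_frac = 0.5, seed = 1) {
  set.seed(seed)  # fixed seed so the shuffle is repeatable
  shuffled <- data[sample(nrow(data)), , drop = FALSE]
  n_train <- round(train_frac * nrow(shuffled))
  n_test <- nrow(shuffled) - n_train
  list(
    train = shuffled[seq_len(n_train), , drop = FALSE],
    test  = shuffled[seq_len(n_test) + n_train, , drop = FALSE]
  )
}

# Stand-in data (assumption): 10 rows with an id column, used in place of OnlineRetail.
toy <- data.frame(id = 1:10, value = (1:10) * 2)
parts <- split_train_test(toy)

nrow(parts$train)  # 5
nrow(parts$test)   # 5

# The two sets share no rows and together cover the original data.
length(intersect(parts$train$id, parts$test$id))    # 0
setequal(c(parts$train$id, parts$test$id), toy$id)  # TRUE
```

Returning both pieces in a list leaves the original data frame untouched, whereas the listing above reuses the name OnlineRetail for the training set. Indexing with `seq_len()` also behaves sensibly when one side is empty, since `(n + 1):n` in R counts backwards rather than yielding an empty range.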
