April 2017
Beginner to intermediate
420 pages
9h 58m
English
You can use H2O's functionality to partition the data into train and test sets. The first step is to create a vector of uniformly distributed random numbers, one for each row of the full data:
> rand <- h2o.runif(bank, seed = 123)
You can then build the partitioned data and assign each partition a key name, as follows:
> train <- bank[rand <= 0.7, ]
> train <- h2o.assign(train, key = "train")
> test <- bank[rand > 0.7, ]
> test <- h2o.assign(test, key = "test")
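The same thresholding idea can be sketched in base R without an H2O cluster. This is only an illustration under stated assumptions: the `toy` data frame and its columns are hypothetical stand-ins for the bank data, and `runif()` plays the role that `h2o.runif()` plays above.

```r
# Toy data frame standing in for the bank data (hypothetical values).
set.seed(123)
toy <- data.frame(
  x = rnorm(1000),
  y = factor(sample(c("no", "yes"), 1000, replace = TRUE,
                    prob = c(0.88, 0.12)))
)

# One uniform draw in [0, 1] per row, analogous to h2o.runif().
rand <- runif(nrow(toy))

# Rows with a draw at or below 0.7 go to train; the rest go to test.
toy_train <- toy[rand <= 0.7, ]
toy_test  <- toy[rand > 0.7, ]

# Every row lands in exactly one of the two sets.
nrow(toy_train) + nrow(toy_test) == nrow(toy)
```

Because each row is assigned by its own random draw, the split is only approximately 70/30; the exact sizes vary with the seed.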
With these created, it is a good idea to check that the response variable has a similar class balance in the train and test sets. To do this, you can use the h2o.table() function on the response column, which in our case is column 64:
> h2o.table(train[, 64])
    y Count
1  no  2783
2 yes   396
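The training counts above work out to roughly 12.5 percent "yes" (396 of 3,179 rows). Raw counts are hard to compare between sets of different sizes, so it can help to look at class shares instead. The following base R sketch shows one way to do that; the simulated response `y` and the helper `class_share()` are hypothetical and only loosely mirror the bank data.

```r
# Hypothetical helper: share of each class in a response vector.
class_share <- function(y) {
  counts <- table(y)
  setNames(as.numeric(counts) / sum(counts), names(counts))
}

# Simulated response with a minority "yes" class (toy data, not the bank data).
set.seed(123)
y <- factor(sample(c("no", "yes"), 2000, replace = TRUE,
                   prob = c(0.875, 0.125)))

# Random 70/30 assignment, as in the partitioning step.
in_train <- runif(length(y)) <= 0.7

# One row per set, one column per class; each row sums to 1.
shares <- rbind(
  train = class_share(y[in_train]),
  test  = class_share(y[!in_train])
)
round(shares, 3)
```

If the "yes" share differs noticeably between the two rows, one option is to redraw the split with a different seed or to use a stratified split.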