June 2017
Beginner to intermediate
576 pages
15h 22m
English
Next, we create the training and test datasets. The objective is to sample 80% of the data for the training set and 20% for the test set.
To speed up sampling somewhat, we can take the two tails of the sample_bin range, in sequence, for the test dataset and use the middle of the range for the training data. This is still a random sample: sample_bin was originally generated randomly, so the order or range of its values has no bearing on the randomness.
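The tail-and-middle split described here can be sketched in plain Python. This is only an illustration under stated assumptions, not the book's own code: it assumes sample_bin holds uniformly random integers from 1 to 100, and the helper names `assign_sample_bins` and `split_by_bin_tails` are hypothetical. With those assumptions, bins 1 to 10 and 91 to 100 go to the test set (about 20%), and bins 11 to 90 go to the training set (about 80%).

```python
import random


def assign_sample_bins(n_rows, n_bins=100, seed=42):
    """Assign each row a random bin in 1..n_bins (assumed sample_bin range)."""
    rng = random.Random(seed)
    return [rng.randint(1, n_bins) for _ in range(n_rows)]


def split_by_bin_tails(rows, sample_bins, n_bins=100, test_fraction=0.20):
    """Send the lower and upper tails of the bin range to test, the middle to train."""
    tail = round(n_bins * test_fraction / 2)  # bins per tail, e.g. 10 of 100
    low_cut = tail                            # bins 1..low_cut form the lower tail
    high_cut = n_bins - tail                  # bins above high_cut form the upper tail
    train, test = [], []
    for row, sample_bin in zip(rows, sample_bins):
        if sample_bin <= low_cut or sample_bin > high_cut:
            test.append(row)
        else:
            train.append(row)
    return train, test


rows = list(range(10_000))               # stand-in for the real records
bins = assign_sample_bins(len(rows))
train, test = split_by_bin_tails(rows, bins)
print(len(train), len(test))             # roughly an 80/20 split
```

Because the bins were assigned at random, filtering on contiguous bin ranges is a cheap range comparison per row, yet each row still has the same 20% chance of landing in the test set.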