September 2017
Beginner to intermediate
304 pages
7h 2m
English
To guard against overfitting and make sure that our model can generalize, we are going to split our dataset into a training set and a test set, as discussed in Chapter 3, Evaluation and Validation. We will not bother with a holdout set here, because we are only going to make a single pass through model training, without iterating back and forth between training and testing. However, if you are experimenting with different independent variables and/or iteratively adjusting any parameters of your model, you should create a holdout set and save it until the end of your model development process for final validation.
We will use github.com/kniren/gota/dataframe to create our training and test datasets and then ...