Cross-validation
If you have run the previous experiment, you may have realized that:
- Both the validation and test results vary, as their samples are different
- The chosen hypothesis is often the best one, but this is not always the case
Unfortunately, relying on separate validation and test samples brings uncertainty, and it also reduces the number of examples left for training (the fewer the examples, the higher the variance of the model's estimates).
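A solution is to use cross-validation. Scikit-learn offers a complete module for cross-validation and performance evaluation (sklearn.cross_validation; from version 0.18 onward this functionality lives in sklearn.model_selection, and the old module was removed in 0.20).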
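As a minimal sketch of the idea, the snippet below holds out a test set once and then runs 5-fold cross-validation on the remaining training data. It imports from sklearn.model_selection, where recent Scikit-learn releases provide these utilities. The Iris dataset, the logistic regression classifier, and the choice of five folds are assumptions made for illustration only and are not taken from the previous experiment.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Illustrative data and model (assumptions, not the chapter's own experiment).
X, y = load_iris(return_X_y=True)

# Hold out a test set once; cross-validation runs on the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: each fold is used exactly once for validation,
# while the other four folds are used to fit the model.
scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

print("fold accuracies:", scores)
print("mean: %.3f, std: %.3f" % (scores.mean(), scores.std()))
```

The spread of the per-fold scores gives a rough idea of how much a single validation estimate can vary, while their mean is a steadier figure for comparing hypotheses. The held-out test set stays untouched until a final hypothesis has been chosen.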
By resorting to cross-validation, you'll just need to separate your data into a training and test set, and you will be ...