Understanding cross-validation
Cross-validation is a method of evaluating the generalization performance of a model that is generally more stable and thorough than splitting the dataset into training and test sets.
The most commonly used version of cross-validation is k-fold cross-validation, where k is a number specified by the user (usually five or ten). Here, the dataset is partitioned into k parts of roughly equal size, called folds. For a dataset containing N data points, each fold thus has approximately N / k samples. Then a series of models is trained: each model uses k - 1 folds for training and the one remaining fold for testing. The procedure is repeated for k iterations, each time choosing a different fold for testing, so that every fold serves as the test set exactly once. This yields k accuracy scores, which are usually summarized by their mean to estimate the model's generalization performance.
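As a minimal sketch of this procedure, the following example uses scikit-learn's `cross_val_score` on the iris dataset with a logistic regression classifier; the specific dataset and model are illustrative choices, not requirements of the method:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

iris = load_iris()
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: train on 4 folds, test on the remaining one,
# repeated 5 times so every fold is used as the test set exactly once.
scores = cross_val_score(model, iris.data, iris.target, cv=5)

print(scores)         # one accuracy score per fold
print(scores.mean())  # averaged estimate of generalization performance
```

Because each of the k scores comes from a different test fold, averaging them gives a more reliable performance estimate than any single train/test split, at the cost of training k models instead of one.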