Understanding cross-validation

Cross-validation is a method of evaluating the generalization performance of a model that is generally more stable and thorough than a single split of the dataset into training and test sets.

The most commonly used version of cross-validation is k-fold cross-validation, where k is a number specified by the user (usually five or ten). Here, the dataset is partitioned into k parts of more or less equal size, called folds. For a dataset that contains N data points, each fold should hence have approximately N / k samples. Then, a series of models are trained on the data, using k - 1 folds for training and one remaining fold for testing. The procedure is repeated for k iterations, each time holding out a different fold for evaluation, and the k resulting test scores are averaged to give a single estimate of generalization performance.
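As a minimal sketch of this procedure (using scikit-learn's KFold splitter and a k-nearest-neighbor classifier on the Iris dataset purely for illustration; these choices are assumptions, not the book's exact example), five-fold cross-validation could look like this:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold
    from sklearn.neighbors import KNeighborsClassifier

    # Load a small example dataset (N = 150 samples).
    X, y = load_iris(return_X_y=True)

    # Partition the data into k = 5 folds of roughly N / k samples each.
    kfold = KFold(n_splits=5, shuffle=True, random_state=42)

    scores = []
    for train_idx, test_idx in kfold.split(X):
        # Train on the k - 1 training folds, then score on the held-out fold.
        model = KNeighborsClassifier(n_neighbors=1)
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))

    # Average the k test scores to estimate generalization performance.
    print(np.mean(scores))

Each of the five iterations trains a fresh model, so every sample ends up in a test fold exactly once, and the averaged score is less sensitive to a lucky or unlucky single split.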
