Cross-validation takes the train-test split concept to the next stage. The aim of the machine learning exercise is, in essence, to find what set of model parameters will provide the best performance. A model parameter indicates the arguments that the function (the model) takes. For example, for a decision tree model, parameters may include the number of levels deep the model should be built, number of splits, and so on. If, say, there are n different parameters, each having k different values, the total number of parameters would be k^n. We generally select a fixed set of combinations for each of the parameters and could easily end with 100-1000+ combinations. We will test the performance of the model (for example, ...
The cross-validation parameter
Get Practical Big Data Analytics now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.