The cross-validation parameter

Cross-validation takes the train-test split concept a step further. The aim of a machine learning exercise is, in essence, to find the set of model parameters that gives the best performance. Here, a model parameter means an argument that the function (the model) takes. For a decision tree model, for example, the parameters may include the maximum depth of the tree, the number of splits, and so on. If there are n different parameters, each with k candidate values, the total number of parameter combinations is k^n (for instance, 3 parameters with 10 values each give 10^3 = 1,000 combinations). We generally select a fixed set of values for each parameter and can easily end up with 100-1000+ combinations. We will test the performance of the model (for example, ...
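To make the k^n count concrete, here is a minimal sketch in Python using only the standard library. The parameter names (`max_depth`, `min_samples_split`, `max_leaf_nodes`), their candidate values, and the helper `k_fold_indices` are illustrative assumptions and do not come from the text. The fold-splitting helper shows one common way cross-validation partitions the data, which may differ in detail from the procedure the chapter goes on to describe.

```python
from itertools import product

# Hypothetical grid: n = 3 parameters, each with k = 4 candidate values.
param_grid = {
    "max_depth": [2, 4, 6, 8],
    "min_samples_split": [2, 5, 10, 20],
    "max_leaf_nodes": [8, 16, 32, 64],
}

# Enumerate every combination of values: k ** n = 4 ** 3 entries.
names = list(param_grid)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid.values())
]


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, n_folds=5))

print(len(combinations))               # 64 combinations (4 ** 3)
print(len(combinations) * len(folds))  # 320 model fits with 5 folds
```

Evaluating every combination on every fold multiplies the work, which is why the grid is usually restricted to a fixed, modest set of values per parameter.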
