R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

Selecting appropriate values of k using caret

The number of nearest neighbors, k, strongly affects how well a KNN model generalizes to new data. We will use the caret package to preprocess the data (center and scale), train the model, and apply a validation mechanism that selects the best value of k automatically:

> library(caret)
> trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
> caret_knn_fit <- train(Result ~ Family_size + Income, data = train, method = "knn",
                 trControl = trctrl,
                 preProcess = c("center", "scale"),
                 tuneLength = 10)

The method parameter of the trainControl function holds the value for the resampling technique ...
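To see which value of k the tuning run settles on, the fitted object can be inspected after training. The following is a minimal, self-contained sketch rather than the book's own code: the train data frame is synthetic, the rule that generates Result is invented purely for illustration, and the class labels "Buyer" and "NonBuyer" are hypothetical. Only the column names and the trainControl and train calls follow the recipe above.

```r
library(caret)

set.seed(42)  # make the synthetic data and the resampling reproducible

# Synthetic stand-in for the recipe's training data (all values are made up)
n <- 200
train <- data.frame(
  Family_size = sample(1:8, n, replace = TRUE),
  Income      = round(runif(n, 20000, 120000))
)

# Hypothetical labelling rule, used only to give KNN a pattern to learn
train$Result <- factor(ifelse(train$Income / train$Family_size > 15000,
                              "Buyer", "NonBuyer"))

# 10-fold cross-validation, repeated 3 times, as in the recipe
trctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

# tuneLength = 10 asks caret to evaluate ten candidate values of k
caret_knn_fit <- train(Result ~ Family_size + Income, data = train,
                       method = "knn",
                       trControl = trctrl,
                       preProcess = c("center", "scale"),
                       tuneLength = 10)

# Resampled performance for every candidate k (one row per value)
print(caret_knn_fit$results[, c("k", "Accuracy", "Kappa")])

# The k that caret selected (highest resampled accuracy by default)
best_k <- caret_knn_fit$bestTune$k
print(best_k)

# Predictions reuse the centering and scaling learned during training
new_obs <- data.frame(Family_size = 4, Income = 60000)
print(predict(caret_knn_fit, newdata = new_obs))
```

Calling plot(caret_knn_fit) on the same object draws resampled accuracy against k, which is a quick way to check whether the selected value sits on a clear peak or on a flat plateau.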
