Chapter 8. Machine Learning with the caret Package
So far, we’ve been doing machine learning in a very ad hoc manner. We
have some data, we want to fit a model to it, and then we tune the model to
give us the best result based on whatever sampling we might have done and
on how the data itself is organized. A lot of this relies on recognizing
when to use certain algorithms. Just by visualizing a set of data, we can
usually tell whether a linear regression is a sensible fit. Likewise,
we’ve seen examples for which the data is better suited to clustering via a
k-means algorithm or something similar.
One issue that we’ve seen is that a lot of these algorithms can be very
different from one another. The options for the lm() function are quite
different from those of the nnet() function. Surely there exists something
that provides a common interface for all these different yet commonly used
algorithms. We’re in luck with R in that the caret package offers a
powerhouse of tools to help streamline our model building.
The name “caret” is an acronym that stands for “Classification and Regression
Training,” but the package itself is capable of much more. In the R ecosystem, there are
hundreds of machine learning packages. Becoming familiar with the quirks
and special functionality for each one can be a daunting task. Lucky for
us, caret provides a common interface for all of these packages. Caret also provides great functionality ...
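To make the idea of a common interface concrete, here is a minimal sketch of fitting both a linear regression and a neural network through caret's train() function. It assumes the caret and nnet packages are installed. The built-in mtcars data, the choice of predictors, and the 5-fold cross-validation setup are illustrative assumptions rather than anything prescribed by the text.

```r
# A minimal sketch, assuming the caret and nnet packages are installed.
library(caret)

set.seed(123)  # make the resampling reproducible

# Shared resampling setup: 5-fold cross-validation (an illustrative choice)
ctrl <- trainControl(method = "cv", number = 5)

# Linear regression through caret's train() interface
fit_lm <- train(
  mpg ~ wt + hp, data = mtcars,
  method = "lm",
  trControl = ctrl
)

# A single-hidden-layer neural network through the same interface;
# only the method string and a few model-specific arguments change
fit_nnet <- train(
  mpg ~ wt + hp, data = mtcars,
  method = "nnet",
  trControl = ctrl,
  preProcess = c("center", "scale"),  # rescale predictors before fitting
  linout = TRUE,   # passed through to nnet(): linear output for regression
  trace = FALSE    # passed through to nnet(): suppress optimizer output
)

# Both results are "train" objects, so the same accessors work on each
fit_lm$results      # resampled performance for the linear model
fit_nnet$bestTune   # tuning parameters caret selected for the network
predict(fit_lm, newdata = head(mtcars))
predict(fit_nnet, newdata = head(mtcars))
```

The two calls differ only in the method argument and in the extra arguments forwarded to the underlying fitting function. The resampling control, the formula, and the prediction step are shared.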