Mastering Machine Learning with R - Second Edition by Cory Lesmeister

Data manipulation with dplyr

Over the past couple of years, I have been using dplyr more and more to manipulate and summarize data. It is often faster than the equivalent base R functions, lets you chain functions together, and, once you are familiar with it, has a more user-friendly syntax. In my experience, just a few functions can accomplish the majority of your data manipulation needs. Install the package as described above, then load it into the R environment:

    > library(dplyr)
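
To make the point about chaining concrete, here is a minimal sketch that writes the same steps twice, once as nested calls and once chained with the `%>%` pipe that dplyr re-exports. It uses the built-in iris data introduced in the next paragraph; the cutoff of 7 and the object names `nested` and `chained` are assumptions chosen only for illustration.

```r
library(dplyr)

# Nested form: the steps have to be read from the inside out
nested <- head(select(filter(iris, Sepal.Length > 7), Species, Sepal.Length), 3)

# Chained form: the same steps, read from top to bottom
chained <- iris %>%
  filter(Sepal.Length > 7) %>%      # keep rows with long sepals
  select(Species, Sepal.Length) %>% # keep two columns
  head(3)                           # first three rows only

# Both forms should produce the same data frame
identical(nested, chained)
```

The chained version does no extra work; it only changes the order in which the steps are written, which tends to matter more as the number of steps grows.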

Let's explore the iris dataset that ships with R. Two of the most useful functions are summarize() and group_by(). In the code that follows, we see how to produce a table of the mean of Sepal.Length grouped by Species. The variable that holds the mean will be called average ...
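
As a minimal sketch of the group_by() and summarize() combination described above (not necessarily the book's exact listing), the call could look like the following. The column name `average` comes from the text; the object name `avg_by_species` and the use of the `%>%` pipe are assumptions made for this example.

```r
library(dplyr)

# Group the rows by Species, then collapse each group to one row
# holding the mean of Sepal.Length in a column called average
avg_by_species <- iris %>%
  group_by(Species) %>%
  summarize(average = mean(Sepal.Length))

# One row per species, with columns Species and average
avg_by_species
```

Because group_by() only marks the grouping and summarize() does the aggregation, adding further statistics should just be a matter of adding more named arguments to summarize().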
