Practical Predictive Analytics by Ralph Winters

Breaking out summaries by groups

Following an initial inspection of the data, it is a good idea to look at summary statistics of the target variable broken down by some of the categories (or factors). We could do this using SQL; however, for this example we will use the dplyr package, whose syntax is SQL-like and whose piped style should be easy for anyone familiar with SQL and/or Linux to pick up.

One of our goals is to break down Total.Costs by some of the factors to see whether costs differ among the levels. Let's start with something easy and break out Total.Costs by the day of the week. We will do this by piping the df dataframe to the dplyr group_by command, which will then send it to a ...
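
As a rough illustration of the grouping idea described here, the following is a minimal sketch rather than the book's own code. It assumes a small made-up df and a hypothetical Weekday column (the real column name is not shown in this excerpt), and it passes the grouped data to summarise as one plausible next step; the values are invented purely so the example runs.

```r
library(dplyr)

# Toy stand-in for the book's df; the values are invented for illustration.
# "Weekday" is a hypothetical column name, not taken from the source.
df <- data.frame(
  Weekday = c("Mon", "Mon", "Tue", "Tue", "Wed"),
  Total.Costs = c(100, 200, 150, 250, 300)
)

# Pipe df into group_by, then compute per-group summary statistics.
cost_by_day <- df %>%
  group_by(Weekday) %>%
  summarise(
    n = n(),                              # number of rows in each group
    mean_cost = mean(Total.Costs),        # average cost per weekday
    median_cost = median(Total.Costs)     # median cost per weekday
  )

print(cost_by_day)
```

Each row of cost_by_day corresponds to one level of the grouping factor, much like a SQL GROUP BY query with aggregate columns, so differences in cost between levels can be compared directly.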