Aggregation
The most straightforward way of summarizing data is to call the aggregate function from the stats package, which does exactly what we are looking for: it splits the data into subsets by a grouping variable, then computes summary statistics for each subset separately. The most basic way to call aggregate is to pass the numeric vector to be aggregated, a factor variable (wrapped in a list) that defines the splits, and, in the FUN argument, the function to be applied to each subset. Now, let's see the average proportion of diverted flights on each weekday:
> aggregate(hflights$Diverted, by = list(hflights$DayOfWeek),
+   FUN = mean)
  Group.1           x
1       1 0.002997672
2       2 0.002559323
3       3 0.003226211
4       4 0.003065727
5       5 0.002687865
6       6 0.002823121
7       7 0.002589057
Well, it took ...
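For readers without the hflights package at hand, the same split-and-summarize pattern can be tried on a small data frame. The sketch below is only an illustration: the flights data frame and its values are made up, not taken from hflights. It shows the vector interface used above next to the formula interface that aggregate also accepts.

```r
# Hypothetical toy data standing in for hflights (values are made up)
flights <- data.frame(
  DayOfWeek = c(1, 1, 2, 2, 2, 3),
  Diverted  = c(0, 1, 0, 0, 1, 0)
)

# Vector interface, as in the call above: the values, a list of
# grouping variables in `by`, and the summary function in `FUN`.
# Naming the list element replaces the default "Group.1" column name.
by_vector <- aggregate(flights$Diverted,
                       by = list(DayOfWeek = flights$DayOfWeek),
                       FUN = mean)

# Formula interface: response ~ grouping variable, looked up in `data`
by_formula <- aggregate(Diverted ~ DayOfWeek, data = flights, FUN = mean)

print(by_vector)
print(by_formula)
```

Both calls return one row per weekday. The vector interface labels the summary column x, while the formula interface keeps the original column name, which tends to be easier to read in later steps.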