October 2017
Beginner to intermediate
236 pages
7h 38m
English
Step 2 of the previous section takes the entire dataset USAairlineData2016 as input and passes it through the select() function to keep only the variables relevant to this recipe. It then groups the data by each unique combination of the origin and month variables and, for each group, calculates summary statistics of the departure delay: minimum, mean, median, and maximum. The first few rows of the output from step 2 are as follows:
> head(desStat)
Source: local data frame [6 x 6]
Groups: ORIGIN [1]

  ORIGIN MONTH MIN_DELAY MEAN_DELAY MEDIAN_DELAY MAX_DELAY
   <chr> <int>     <int>      <dbl>        <dbl>     <int>
1    ABE     1       -13   9.994186           -2       374
2    ABE     2       -15  15.841772           -3       449
3    ABE     4       -14   1.386905           -3       229
4    ABE     5       -15   2.777778           -4 ...
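The select, group, and summarize pipeline described above can be sketched on a small toy data frame. This is only an illustration, not the recipe's actual code: the data frame toy, the result name desStatToy, and the delay column name DEP_DELAY are assumptions made for this sketch, and na.rm = TRUE is added here so that missing delays do not turn every statistic into NA.

```r
library(dplyr)

# Hypothetical toy data standing in for USAairlineData2016.
# DEP_DELAY and CARRIER are assumed column names used only for illustration.
toy <- data.frame(
  ORIGIN    = c("ABE", "ABE", "ABE", "ABE", "ATL", "ATL"),
  MONTH     = c(1L, 1L, 2L, 2L, 1L, 1L),
  DEP_DELAY = c(-5, 15, NA, 10, 0, 20),
  CARRIER   = c("AA", "DL", "AA", "DL", "AA", "DL"),
  stringsAsFactors = FALSE
)

desStatToy <- toy %>%
  # Keep only the variables needed for the summary
  select(ORIGIN, MONTH, DEP_DELAY) %>%
  # One group per unique combination of origin and month
  group_by(ORIGIN, MONTH) %>%
  # Summary statistics of departure delay within each group
  summarise(
    MIN_DELAY    = min(DEP_DELAY, na.rm = TRUE),
    MEAN_DELAY   = mean(DEP_DELAY, na.rm = TRUE),
    MEDIAN_DELAY = median(DEP_DELAY, na.rm = TRUE),
    MAX_DELAY    = max(DEP_DELAY, na.rm = TRUE)
  )

head(desStatToy)
```

By default, summarise() drops only the last grouping variable, so the result stays grouped by ORIGIN, which matches the "Groups: ORIGIN" line in the output shown above. Calling ungroup() afterwards would return a plain data frame if the grouping is no longer needed.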