Aggregating the data to calculate summary statistics 

To aggregate values over some grouping, pandas has the groupby operation, one of the library's killer features. Calling groupby creates a GroupBy object, which can behave as an iterable of (name, group) tuples or much like a dataframe: you can select one or many columns from it the same way you would from a dataframe.
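
As a minimal sketch of both behaviours, the snippet below groups a small dataframe by one column. The data and the column names (city, temp, rain) are made up for illustration and are not part of any real dataset.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
    "temp": [3.0, 5.0, 14.0, 16.0, 18.0],
    "rain": [10, 12, 4, 6, 2],
})

grouped = df.groupby("city")

# Iterating yields (name, group) pairs; each group is a sub-dataframe
group_sizes = {name: len(group) for name, group in grouped}

# Selecting a column works the same way as on a dataframe
mean_temp = grouped["temp"].mean()
```

Here group_sizes maps each city to its number of rows, and mean_temp is a Series indexed by city.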

Most importantly, GroupBy objects have two special methods:

  • agg, which performs the given aggregation function (say, calculating averages) for each group and returns the results as a dataframe with one row per group.
  • transform, which computes the same aggregates but returns the corresponding group's aggregate value for each row of the original dataframe, so the result lines up with the original rows.
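
The difference between the two is easiest to see side by side. The sketch below assumes a small made-up dataframe (the city, temp and rain columns are hypothetical) and uses the mean as the aggregation.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
    "temp": [3.0, 5.0, 14.0, 16.0, 18.0],
    "rain": [10, 12, 4, 6, 2],
})

grouped = df.groupby("city")

# agg: one row per group, one column per aggregated column
per_group = grouped[["temp", "rain"]].agg("mean")

# transform: one value per original row, aligned with df's index
per_row = grouped["temp"].transform("mean")

# Because the result is aligned, it can be attached as a new column
df["temp_city_mean"] = per_row
```

per_group has two rows (one per city), while per_row has five values, one for each row of df, which makes transform convenient for things like comparing each row against its group average.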

The great part of both of ...

