Learning pandas - Second Edition by Michael Heydt

The split, apply, and combine (SAC) pattern

Many data analysis problems follow a processing pattern referred to as split-apply-combine. In this pattern, three steps are taken to analyze data:

  • A dataset is split into smaller pieces based on certain criteria
  • Each of these pieces is operated upon independently
  • All the results are then combined back and presented as a single unit
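
The three steps can be sketched in pandas by carrying out each one explicitly. This is a minimal illustration, not the book's own listing: the column names key and value and the sample numbers are assumptions made up for the example.

```python
import pandas as pd

# Illustrative data only; the column names and values are assumptions.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Split: build one sub-frame per distinct key.
pieces = {key: group for key, group in df.groupby("key")}

# Apply: compute the mean of each piece independently.
means = {key: group["value"].mean() for key, group in pieces.items()}

# Combine: gather the per-group results into a single Series.
result = pd.Series(means, name="value")
result.index.name = "key"

print(result)
```

In everyday use the three steps are usually collapsed into one chained expression; writing them out separately here only makes each stage visible.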

The following diagram demonstrates a simple split-apply-combine process to calculate the mean of values grouped by a character-based key (a or b):

The data is then split by the index label into two groups (one each for a and b). The mean of the values in each ...
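
A grouping by index label of the kind described here might look like the following sketch. The Series contents are hypothetical and are not the values from the diagram; the example assumes the keys a and b are stored as index labels.

```python
import pandas as pd

# Hypothetical values labelled with index keys 'a' and 'b'.
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "a", "b"], name="value")

# groupby(level=0) splits on the index labels; mean() is applied to
# each group, and pandas combines the results into one Series.
group_means = s.groupby(level=0).mean()

print(group_means)
```

The result is indexed by the group keys, so the mean for a single group can be read with group_means["a"].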
