© Hannah Stepanek 2020
H. Stepanek, Thinking in Pandas, https://doi.org/10.1007/978-1-4842-5839-2_7

7. Groupby

Hannah Stepanek
Portland, OR, USA

Chances are that at some point when working with data in pandas, you will need to group and aggregate it. This is what groupby is for: it lets you cluster your data into groups and run aggregate calculations on those groups.
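
As a small illustration of grouping and aggregating, the sketch below totals a numeric column per group. The DataFrame, its column names, and its values are hypothetical and made up for this example; they do not come from the book.

```python
import pandas as pd

# Hypothetical sales data, invented purely for illustration.
sales = pd.DataFrame({
    "region": ["east", "west", "east", "west", "east"],
    "amount": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Cluster the rows by region, then compute one aggregate value per group.
total_by_region = sales.groupby("region")["amount"].sum()
print(total_by_region)
```

The result is a Series indexed by the group key, with one entry per distinct region.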

Using groupby correctly

When starting out, you may be inclined to do something like Listing 7-1, where you cluster your data into groups, loop over each group, and run some aggregate. This, however, results in terrible performance because, just as we saw in Chapter 6, we are looping in Python and not in C. If instead you call the aggregate function directly ...
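
The contrast can be sketched roughly as follows. This is not the book's Listing 7-1; the DataFrame, the column names `team` and `score`, and the variable names are assumptions chosen for illustration. Both approaches produce the same totals, but the second hands the whole aggregation to pandas in a single call instead of iterating over groups in a Python loop.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "team": ["a", "b", "a", "c", "b", "a"],
    "score": [1, 2, 3, 4, 5, 6],
})

# Slow pattern: iterate over the groups in Python and aggregate each one.
looped = {}
for team, group in df.groupby("team"):
    looped[team] = group["score"].sum()
looped = pd.Series(looped, name="score")

# Preferred pattern: call the aggregate directly on the grouped object,
# so pandas performs the per-group work without a Python-level loop.
direct = df.groupby("team")["score"].sum()

# Both approaches yield the same per-team totals.
assert looped.to_dict() == direct.to_dict()
```

On a frame this small the difference is not measurable; the gap would be expected to widen as the number of groups grows, since the loop pays Python overhead once per group.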
