Data aggregation with Pandas DataFrames

Data aggregation is a term used in the field of relational databases. In a database query, we can group data by the value in a column or columns. We can then perform various operations on each of these groups. The Pandas DataFrame has similar capabilities. We will generate data held in a Python dict and then use this data to create a Pandas DataFrame. We will then practice the Pandas aggregation features:

  1. Seed the NumPy random generator to make sure that the generated data will not differ between repeated program runs. The data will have four columns:
    • Weather (a string)
    • Food (also a string)
    • Price (a random float)
    • Number (a random integer between one and nine)

    The use case is that we have the results of some sort ...

Get Python Data Analysis - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.