Unit 31: Getting Used to Pandas Data Structures

The pandas module adds two new containers to Python's already rich set of data structures: Series and DataFrame. A series is a one-dimensional, labeled (in other words, indexed) vector. A frame is a table with labeled rows and columns, not unlike an Excel spreadsheet or MySQL table. Each frame column is a series. With a few exceptions, pandas treats frames and series similarly.
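As a minimal sketch of the two containers, the following builds one series and one frame and then pulls a column out of the frame to show that it is a series. The city names, numbers, and variable names are invented for illustration and do not come from the text.

```python
import pandas as pd

# A series: a one-dimensional vector with a label for each value.
# All names and numbers here are made up for illustration.
population = pd.Series(
    [8.4, 3.9, 2.7],
    index=["NYC", "LA", "Chicago"],
    name="population_m",
)

# A frame: a table with labeled rows and labeled columns.
cities = pd.DataFrame(
    {"population_m": [8.4, 3.9, 2.7], "state": ["NY", "CA", "IL"]},
    index=["NYC", "LA", "Chicago"],
)

# Each frame column is itself a series that shares the frame's row labels.
column = cities["population_m"]
print(type(column).__name__)             # Series
print(column.index.equals(cities.index)) # True

# Values are looked up by label in both containers.
print(population["LA"])                  # 3.9
print(cities.loc["LA", "state"])         # CA
```

Label-based lookup works the same way on the series and on the frame, which is one instance of pandas treating the two containers similarly.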

Frames and series are not simply storage containers. They have built-in support for a variety of data-wrangling operations, such as:

  • Single-level and hierarchical indexing

  • Handling missing data

  • Arithmetic and Boolean operations on entire columns and tables

  • Database-type operations (such as merging and aggregation)
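The four kinds of operations listed above can be sketched on one small table. This is an illustrative example under stated assumptions, not the book's own code: the sales data, the column names, and the managers table are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sales data, invented for this example.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "year": [2015, 2016, 2015, 2016],
    "units": [10.0, np.nan, 7.0, 9.0],
    "price": [2.0, 2.0, 3.0, 3.0],
})

# Hierarchical indexing: label each row by a (region, year) pair.
indexed = sales.set_index(["region", "year"])
east_2015_units = indexed.loc[("East", 2015), "units"]

# Handling missing data: replace the missing unit count with zero.
filled = sales.fillna({"units": 0.0})

# Arithmetic and Boolean operations on entire columns.
filled["revenue"] = filled["units"] * filled["price"]
big = filled["revenue"] > 15

# Database-type operations: merge with a second table, then aggregate.
managers = pd.DataFrame({
    "region": ["East", "West"],
    "manager": ["Ann", "Bob"],
})
merged = filled.merge(managers, on="region")
totals = merged.groupby("manager")["revenue"].sum()

print(east_2015_units)  # 10.0
print(big.tolist())     # [True, False, True, True]
print(totals["Ann"], totals["Bob"])  # 20.0 48.0
```

Replacing the missing value with zero is only one possible policy. Dropping the row with `dropna` would be an equally valid choice, depending on what a missing count means in the data.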
