Chapter 5. Data Analysis with pandas
Data! Data! Data! I can’t make bricks without clay!
Sherlock Holmes
This chapter is about pandas, a library for data analysis with a focus on tabular data. pandas is a powerful tool that not only provides many useful classes and functions but also does a great job of wrapping functionality from other packages. The result is a user interface that makes data analysis, and in particular financial analysis, a convenient and efficient task.
This chapter covers the following fundamental data structures:
| Object type | Meaning | Used for |
|---|---|---|
| DataFrame | 2-dimensional data object with index | Tabular data organized in columns |
| Series | 1-dimensional data object with index | Single (time) series of data |
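
As a rough illustration of how the two structures relate, the following minimal sketch builds a small DataFrame object and pulls a single column out of it as a Series object. The column name `numbers`, the index labels, and the values are invented for this example and are not taken from the chapter.

```python
import pandas as pd

# Hypothetical sample data: one column of numbers with letter labels as index.
df = pd.DataFrame({"numbers": [10, 20, 30, 40]},
                  index=["a", "b", "c", "d"])

# A DataFrame is two-dimensional: rows (labeled by the index) and columns.
print(df.ndim, df.shape)

# Selecting a single column returns a one-dimensional Series
# that carries the same index as the DataFrame.
s = df["numbers"]
print(type(s).__name__, s.ndim)

# Both objects support label-based access through the index.
print(df.loc["c", "numbers"], s["c"])
```

Both objects share the same index here, which is why a Series can be thought of as a single labeled column of a DataFrame.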
The chapter is organized as follows:
- “The DataFrame Class”: This section starts by exploring the basic characteristics and capabilities of the DataFrame class of pandas by using simple and small data sets; it then shows how to transform a NumPy ndarray object into a DataFrame object.
- “Basic Analytics” and “Basic Visualization”: Basic analytics and visualization capabilities are introduced in these sections (later chapters go deeper into these topics).
- “The Series Class”: This rather brief section covers the Series class of pandas, which in a sense represents a special case of the DataFrame class with a single column of data only.
- “GroupBy Operations”: One of the strengths of the DataFrame class lies in grouping data according to a single or multiple ...