December 2015
Beginner to intermediate
202 pages
4h
English
Pandas is an open-source, high-performance Python library that provides easy-to-use data structures and data analysis tools. It was created to aid in the analysis of time series data and has since become a standard in the Python community. In addition to its core data structures, the Series and the DataFrame, which support most stages of a data science workflow, it has built-in analysis methods that we'll use later in the book.
Before we can start cleaning and standardizing data using Pandas, we need to get the data into a Pandas DataFrame, the primary data structure of Pandas. You can think of a DataFrame as an Excel spreadsheet: it has rows and columns. Once data is in a DataFrame, we can use the ...
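As a rough illustration of getting data into a DataFrame, here is a minimal sketch. It is not taken from the book: the sample records and the column names (`name`, `city`, `amount`) are invented for the example, and `io.StringIO` stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
# The last record deliberately leaves the "amount" field empty.
csv_text = """name,city,amount
Ann,Boston,12.5
Bob,Denver,7.0
Cy,Austin,
"""

# read_csv accepts a file path or any file-like object;
# StringIO lets us treat the string above as if it were a file.
df = pd.read_csv(io.StringIO(csv_text))

# A DataFrame has rows and columns, much like a spreadsheet.
n_rows, n_cols = df.shape
column_names = list(df.columns)

# Selecting a single column returns a Series.
amounts = df["amount"]

# Empty fields are read as missing values, which is where cleaning usually starts.
missing_amounts = int(amounts.isna().sum())

print(df)
print(f"{n_rows} rows, {n_cols} columns, {missing_amounts} missing amount(s)")
```

With a real file, the same call would take a path instead of the `StringIO` object, for example `pd.read_csv("data.csv")`, where `data.csv` is a placeholder name.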