1

Anticipating Data Cleaning Issues When Importing Tabular Data with pandas

Scientific distributions of Python (Anaconda, WinPython, Canopy, and so on) provide analysts with an impressive range of data manipulation, exploration, and visualization tools. One important tool is pandas. Developed by Wes McKinney in 2008, but really gaining in popularity after 2012, pandas is now an essential library for data analysis in Python. The recipes in this book demonstrate how many common data preparation tasks can be done more easily with pandas than with other tools. While we work with pandas extensively in this book, we also use other popular packages such as Numpy, matplotlib, and scipy.

A key pandas object is the DataFrame, which represents data as ...

Get Python Data Cleaning Cookbook - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.