1. Data Exploration and Cleaning


In this chapter, you will take your first steps with Python and Jupyter notebooks, two of the most common tools data scientists use. You will then take a first look at the dataset for the case study project that forms the core of this book, and begin to develop an intuition for the quality assurance checks that data must pass before model building. By the end of the chapter, you will be able to use pandas, one of the most widely used packages for wrangling tabular data in Python, to do exploratory data analysis, quality assurance, and data cleaning.


Most businesses possess a wealth of data on their operations and customers. Reporting on this data in the form of descriptive charts, graphs, ...
