Chapter 1

Data Preparation and Cleaning

Learning Objectives

By the end of this chapter, you will be able to:

  • Create pandas DataFrames in Python
  • Read data from, and write data to, different file formats
  • Slice, aggregate, filter, and apply functions (built-in and custom) to DataFrames
  • Join DataFrames, handle missing values, and combine different data sources

This chapter covers basic data preparation and manipulation techniques in Python, which form the foundation of data science.
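As a rough preview of the objectives listed above, the following minimal sketch touches each of them once using pandas. The sample data, the column names (`customer_id`, `region`, `revenue`, `segment`), and the helper `revenue_band` are all hypothetical, invented purely for illustration; they are not taken from the chapter's own datasets.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for this sketch.
sales = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["North", "South", "North", "East"],
    "revenue": [120.0, None, 80.0, 200.0],  # one missing value
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "wholesale", "retail"],
})

# Write to CSV and read it back. An in-memory buffer stands in for a
# file on disk so the sketch does not touch the filesystem.
buffer = io.StringIO()
sales.to_csv(buffer, index=False)
buffer.seek(0)
sales = pd.read_csv(buffer)

# Handle missing values: here, treat a missing revenue as 0.
sales["revenue"] = sales["revenue"].fillna(0.0)

# Filter: keep only rows from the North region.
north = sales[sales["region"] == "North"]

# Aggregate: total revenue per region.
revenue_by_region = sales.groupby("region")["revenue"].sum()


def revenue_band(value):
    """Hypothetical custom function: label revenue as high or low."""
    return "high" if value >= 100 else "low"


# Apply a custom function to a column.
sales["band"] = sales["revenue"].apply(revenue_band)

# Join: a left join keeps every sale, even without a matching customer.
merged = sales.merge(customers, on="customer_id", how="left")

print(revenue_by_region)
print(merged)
```

Filling missing revenue with 0 is only one possible choice; dropping the row or imputing a value may suit a given dataset better. The left join is likewise an assumption made so that sales without a matching customer record are kept rather than discarded.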


The way we make decisions in today's world is changing. A very large proportion of our decisions (choosing which movie to watch, which song to listen to, which item to buy, or which restaurant to visit) rely upon recommendations and ratings generated ...
