6 Data Assembly

By now, you should be able to load data into pandas and do some basic visualizations. This part of the book focuses on data cleaning tasks. We begin by assembling a single data set for analysis from multiple data sets.

Learning Objectives

  • Identify when data needs to be combined

  • Identify whether data needs to be concatenated or joined together

  • Use the appropriate functions or methods to combine multiple data sets

  • Produce a single data set from multiple files

  • Assess whether data was joined properly
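The difference between concatenating and joining, and one way to check a join, can be sketched as follows. This is a minimal illustration, not the book's own example: the data frames `jan`, `feb`, and `stores` and all of their values are made up for the purpose. It uses `pd.concat` to stack tables that share columns, and `DataFrame.merge` with the `indicator` and `validate` parameters to join on a key and flag rows that found no match.

```python
import pandas as pd

# Hypothetical data: two tables with the same columns,
# as if loaded from two monthly files.
jan = pd.DataFrame({"id": [1, 2], "sales": [100, 150]})
feb = pd.DataFrame({"id": [1, 3], "sales": [120, 90]})

# Concatenate: the tables have the same shape, so stack the rows.
sales = pd.concat([jan, feb], ignore_index=True)

# Hypothetical lookup table describing a different observational unit.
stores = pd.DataFrame({"id": [1, 2], "city": ["Austin", "Boston"]})

# Join: match rows on a shared key.
# indicator=True adds a "_merge" column recording where each row came from;
# validate="many_to_one" raises an error if "id" is duplicated in stores.
merged = sales.merge(
    stores, on="id", how="left", indicator=True, validate="many_to_one"
)

# Rows from sales that had no matching store.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched)
```

Inspecting `unmatched`, and comparing row counts before and after the merge, is a quick sanity check that the join behaved as intended.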

6.1 Combine Data Sets

We first talked about tidy data principles in Chapter 4. This chapter will cover the third criterion in the original “Tidy Data” paper¹: “each type of observational unit forms a table.”

1. Tidy ...
