Pandas for Everyone: Python Data Analysis, First Edition

by Daniel Y. Chen
December 2017
Beginner to intermediate
410 pages
12h 45m
English
Addison-Wesley Professional

4. Data Assembly

4.1 Introduction

By now, you should be able to load data into Pandas and do some basic visualizations. This part of the book focuses on various data cleaning tasks. We begin with assembling a data set for analysis by combining various data sets together.

Concept Map

1. Prior knowledge

a. loading data

b. subsetting data

c. functions and class methods

Objectives

This chapter will cover:

1. Tidy data

2. Concatenating data

3. Merging data sets
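
As a rough illustration of the last two objectives, the sketch below stacks two small tables with `pd.concat` and then joins a lookup table with `DataFrame.merge`. The frames and column names (`id`, `value`, `label`) are made up for this example and do not come from the book.

```python
import pandas as pd

# Hypothetical toy tables (assumed for illustration, not from the book)
df1 = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
df2 = pd.DataFrame({"id": [3, 4], "value": [30, 40]})

# Concatenation: stack the rows of both tables and rebuild the index
stacked = pd.concat([df1, df2], ignore_index=True)

# Merging: attach a label to each row by matching on the shared "id" column.
# how="left" keeps every row of `stacked`, even when no label matches.
labels = pd.DataFrame({"id": [1, 2, 3], "label": ["a", "b", "c"]})
merged = stacked.merge(labels, on="id", how="left")

print(merged)
```

Here the row with `id` 4 has no match in `labels`, so its `label` is left missing rather than the row being dropped.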

4.2 Tidy Data

Hadley Wickham,1 one of the more prominent members of the R community, talks about the idea of tidy data. In fact, he’s written a paper about this concept in the Journal of Statistical Software.2 Tidy data is a framework to structure data sets so they can be easily analyzed. ...
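
To make the idea concrete, here is a minimal sketch that assumes the usual reading of tidy data from Wickham's paper: each variable gets its own column and each observation its own row. The table, its values, and the column names (`city`, `year`, `measurement`) are invented for illustration. `DataFrame.melt` is one way to reshape a wide table into that form.

```python
import pandas as pd

# Hypothetical wide table: the year is spread across column headers,
# so the variable "year" is stored in names instead of in a column
wide = pd.DataFrame({
    "city": ["A", "B"],
    "2016": [1.0, 2.0],
    "2017": [1.5, 2.5],
})

# Reshape so every row is one (city, year) observation
tidy = wide.melt(id_vars="city", var_name="year", value_name="measurement")

print(tidy)
```

After the reshape, `year` and `measurement` are ordinary columns, which makes filtering and grouping by year straightforward.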


ISBN: 9780134547046