Practical Data Analysis Cookbook

by Tomasz Drabas
April 2016
Content level: Beginner to intermediate
384 pages
8h 36m
English
Packt Publishing

Chapter 1. Preparing the Data

In this chapter, we will cover the basic tasks of reading, storing, and cleaning data using Python and OpenRefine. You will learn the following recipes:

  • Reading and writing CSV/TSV files with Python
  • Reading and writing JSON files with Python
  • Reading and writing Excel files with Python
  • Reading and writing XML files with Python
  • Retrieving HTML pages with pandas
  • Storing and retrieving from a relational database
  • Storing and retrieving from MongoDB
  • Opening and transforming data with OpenRefine
  • Exploring the data with OpenRefine
  • Removing duplicates
  • Using regular expressions and GREL to clean up the data
  • Imputing missing observations
  • Normalizing and standardizing features
  • Binning the observations
  • Encoding categorical variables
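
As a taste of the first recipe in the list, here is a minimal sketch of a CSV/TSV round trip. It assumes only the Python standard library (`csv` and `io`); the sample records and the helper names `write_delimited` and `read_delimited` are hypothetical and are not taken from the book, whose own recipes may use different tools.

```python
import csv
import io

# Hypothetical sample records for illustration; not data from the book.
rows = [
    {"city": "Sacramento", "beds": "2", "price": "59222"},
    {"city": "Rancho Cordova", "beds": "3", "price": "68212"},
]


def write_delimited(records, delimiter):
    """Serialize a list of dicts to delimited text and return it as a string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def read_delimited(text, delimiter):
    """Parse delimited text (with a header row) into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(record) for record in reader]


# Write CSV, read it back, re-write as TSV, and read that back too.
csv_text = write_delimited(rows, ",")
tsv_text = write_delimited(read_delimited(csv_text, ","), "\t")
round_tripped = read_delimited(tsv_text, "\t")
```

The in-memory `io.StringIO` buffers stand in for files opened with `open(path, newline="")`. Note that the `csv` module reads every value back as a string, so numeric columns need explicit conversion afterwards.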

Introduction ...


Publisher Resources

ISBN: 9781783551668