Practical Data Analysis Cookbook

by Tomasz Drabas
April 2016
Content level: Beginner to intermediate
384 pages
8h 36m
English
Packt Publishing

Using regular expressions and GREL to clean up data

When cleaning up and preparing data for use, we sometimes need to extract information from text fields. Occasionally, we can simply split a text field on a delimiter. However, when the pattern in the data does not allow a simple split, we need to resort to regular expressions.
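To make the contrast concrete, here is a minimal sketch in Python rather than GREL, so it can be run outside OpenRefine. The sample value and its "city, state zip" layout are assumptions made for illustration, since the preview does not show the actual contents of the column; the function name `parse_city_state_zip` is likewise hypothetical.

```python
import re

# Hypothetical sample value; the real column contents are not shown in this preview.
sample = "SACRAMENTO, CA 95838"

# Splitting on a single delimiter only separates the city from the rest:
# the state and ZIP code are still glued together in `rest`.
city, rest = sample.split(", ")

# A regular expression can capture all three parts in one pass.
# Assumed layout: free-text city, a comma, a two-letter state, a five-digit ZIP.
pattern = re.compile(r"^(?P<city>.+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$")


def parse_city_state_zip(value):
    """Return (city, state, zip) or None when the value does not match."""
    match = pattern.match(value.strip())
    if match is None:
        return None
    return match.group("city"), match.group("state"), match.group("zip")


parsed = parse_city_state_zip(sample)
```

Returning `None` for values that do not fit the assumed layout keeps malformed rows visible instead of silently producing wrong splits.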

Getting ready

To follow this recipe, you need to have OpenRefine and virtually any Internet browser installed on your computer.

We assume that you have followed the previous recipes, so your data is already loaded into OpenRefine and the column data types reflect what the columns hold. No other prerequisites are required.

How to do it…

First, let's have a look at the pattern that occurs in our city_state_zip ...



Publisher Resources

ISBN: 9781783551668