Data Preparation in the Big Data Era

by Federico Castanedo
October 2015
Content level: Beginner to intermediate
15 pages
27m
English
O'Reilly Media, Inc.

Overview

Preparing and cleaning data is notoriously expensive, error-prone, and time-consuming: the process accounts for roughly 80% of the total time spent on analysis. As this O’Reilly report points out, enterprises have already invested billions of dollars in big data analytics, so there is a strong incentive to modernize methods for cleaning, combining, and transforming data.

Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, details best practices for reducing the time it takes to convert raw data into actionable insights. With these tools and techniques in mind, your organization will be well positioned to translate big data into big decisions.

  • Explore the problems organizations face today with traditional prep and integration
  • Define the business questions you want to address before selecting, prepping, and analyzing data
  • Learn new methods for preparing raw data, including date-time and string data
  • Understand how some cleaning actions (like replacing missing values) affect your analysis
  • Examine data curation products: modern approaches that scale
  • Consider your business audience when choosing ways to deliver your analysis

Publisher Resources

ISBN: 9781492048329