Chapter 9
Preparing Data
In This Chapter
- Documenting your business objectives
- Processing your data
- Sampling your data
- Transforming your data
The roadmap to building a successful predictive model involves defining business objectives, preparing the data, and then building and deploying the model. This chapter delves into data preparation, which involves:
- Acquiring the data
- Exploring the data
- Cleaning the data
- Selecting variables of interest
- Generating derived variables
- Extracting, loading, and transforming the data
- Sampling the data into training and test datasets
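Three of the steps above (cleaning, generating a derived variable, and sampling into training and test datasets) can be sketched in a few lines of Python. This is only an illustrative sketch, not the chapter's own method: the records, the field names (`original_price`, `paid_price`, `discount_rate`), the helper functions, and the 80/20 split are all assumptions made for the example.

```python
import random

# Hypothetical raw records; the field names are illustrative assumptions.
raw_records = [
    {"customer": "A", "original_price": 20.0, "paid_price": 15.0},
    {"customer": "B", "original_price": 10.0, "paid_price": 10.0},
    {"customer": "C", "original_price": None, "paid_price": 8.0},  # missing value
    {"customer": "D", "original_price": 50.0, "paid_price": 40.0},
    {"customer": "E", "original_price": 30.0, "paid_price": 27.0},
    {"customer": "F", "original_price": 12.0, "paid_price": 6.0},
]


def clean(records):
    """Cleaning: drop any record that has a missing (None) field."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_discount_rate(records):
    """Derived variable: fraction discounted off the original price."""
    return [
        {**r, "discount_rate": 1 - r["paid_price"] / r["original_price"]}
        for r in records
    ]


def train_test_split(records, test_fraction=0.2, seed=42):
    """Sampling: shuffle a copy of the records, then split it in two."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


prepared = add_discount_rate(clean(raw_records))
train, test = train_test_split(prepared)
print(len(train), "training rows;", len(test), "test rows")
```

Dropping incomplete records is the simplest possible cleaning rule and is used here only to keep the sketch short; a real project might fill in missing values instead. Fixing the random seed makes the training/test split reproducible from run to run.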
Data is a four-letter word. It’s amazing that such a small word can describe trillions of gigabytes of information: customer names, addresses, products, discounted versus original prices, store codes, times of purchase, supplier locations, run rates for print advertising, the color of your delivery vans. And that’s just for openers. Data is, or can be, literally everything.
Not every source or type of data will be relevant to the business question you’re trying to answer. Predictive analytics models are built from multiple ...