7.1 INTRODUCTION

7.1.1 Overview

Predictive models are used in many situations where an estimate or forecast is required, for example, to project sales or forecast the weather. A predictive model will calculate an estimate for one or more variables (responses), based on other variables (descriptors). For example, a data set of cars is used to build a predictive model to estimate car fuel efficiency (MPG). A portion of the observations are shown in Table 7.1. A model to predict car fuel efficiency was built using the MPG variable as the response and the variables Cylinders, Displacement, Horsepower, Weight, and Acceleration as descriptors. Once the model has been built, it can be used to make predictions for car fuel efficiency. For example, the observations in Table 7.2 could be presented to the model and the model would predict the MPG column.

A predictive model is some sort of mathematical equation or process that takes the descriptor variables and calculates an estimate for the response or responses. The model attempts to understand the relationship between the input descriptor variables and the output response variables; however, it is just a representation of the relationship. Rather than thinking any model generated as correct or not correct, it may be more useful to think of these models as useful or not useful to what you are trying to accomplish.

Predictive models have a number of uses including:

  • Prioritization: Predictive models can be used to swiftly profile a data set ...

Get Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.