G Data Mining

Data mining aims to discover and make use of patterns and relationships that are manifest in data. Many different methods can be used for data mining, and linear regression is one of them. In a data mining context, we could use linear regression to build a model to predict numbers of interest. Then, once the model is built using one set of cases, we would assess how well it predicts a new set of cases. We will also want to compare the model's performance to that of models constructed using other predictive analytics methods, such as neural networks and decision trees. Let's go through this process a step at a time.
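The build-then-assess process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (generated from made-up coefficients), the 150/50 split between model-building cases and new cases is arbitrary, and the helper names `fit_linear`, `predict`, and `adjusted_r2` are hypothetical rather than taken from the book.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit; returns coefficients with the intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply a fitted model (intercept first) to a matrix of predictors."""
    return coef[0] + X @ coef[1:]

def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R-squared of predictions y_hat against observed values y."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Synthetic data for illustration only; the coefficients are invented.
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=1.0, size=n)

# Build the model on one set of cases, then assess it on a new set.
X_train, y_train = X[:150], y[:150]
X_new, y_new = X[150:], y[150:]

coef = fit_linear(X_train, y_train)
train_fit = adjusted_r2(y_train, predict(coef, X_train), p)
new_fit = adjusted_r2(y_new, predict(coef, X_new), p)

print(f"adjusted R-squared, model-building cases: {train_fit:.3f}")
print(f"adjusted R-squared, new cases:            {new_fit:.3f}")
```

The second printed number is the one that matters most in a data mining context, and the same measure could be computed for any competing model to compare them on equal terms.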

We will no longer dwell on p-values, but instead we'll focus on measures of fit such as adjusted R-squared. Further, we will not dwell so much on the adjusted R-squared we get when we build a model, but instead we'll focus more on the adjusted R-squared we get when we use the model to predict new cases.
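As a reminder, the usual textbook definition of adjusted R-squared is shown here. The symbols are assumptions for this sketch and may differ from the book's notation: n is the number of cases, k is the number of predictor variables, y_i are the observed values, and ŷ_i are the model's predictions.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```

The same formula applies whether the y_i are the cases used to build the model or a new set of cases being predicted; only the data plugged in change.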

Below is the regression equation we developed in Chapter 43. I will call it “The Model.”

The above adjusted R-squared value of 0.67 tells ...

Illuminating Statistical Analysis Using Scenarios and Simulations by Jeffrey E. Kottemann