Chapter 16

Data Mining Using Classic Statistical Methods

In This Chapter

Understanding correlation

Fitting straight lines to data

Investigating the most common nonlinear model

Data miners are not purists, so no hard dividing line exists between the methods used by data miners and the methods used by traditional analysts. Data miners borrow from traditionalists when it is beneficial and practical to do so. The data miner’s toolkit includes some techniques that are familiar even to the strictest of classical statisticians.

Among the old-time favorite techniques of data mining are correlation, linear regression, and logistic regression. This chapter gives you the details on each.

Understanding Correlation

Would you wrap your lips around a car’s exhaust pipe and breathe in a lungful of fumes? Of course not! Why not? Because you know that it's not healthy to inhale exhaust fumes (and it would look ridiculous).

You probably also know that it's not healthy to inhale cigarette smoke. Why not? One reason is that cigarette smoke contains carbon monoxide, the same stuff that's in exhaust fumes.

Maybe you know this now, but a few decades ago, people were unaware of the dangers of smoking. ...

