Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining by Glenn J. Myatt

5.4 COMPARATIVE STATISTICS

5.4.1 Overview

Correlation analysis looks at associations between variables. For example, is there a relationship between interest rates and inflation, or between education level and income? The existence of an association between variables does not imply that one variable causes the other. Even so, understanding these relationships is useful for several reasons. For example, when building a predictive model, comparative statistics can help identify important variables to use.

Figure 5.18. Looking up critical F-statistic

Figure 5.19. Relationships between two variables

The relationship between variables can be complex; however, a number of characteristics of the relationship can be measured:

  • Direction: When comparing two variables, a positive relationship results when higher values in the first variable coincide with higher values in the second, and lower values in the first coincide with lower values in the second. A negative relationship results when higher values in the first variable coincide with lower values in the second, and lower values in the first coincide with higher values in the second. There are also situations where the relationship between the variables is more complex, having a combination ...
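
As a rough illustration of direction, the sketch below classifies a relationship by the sign of the sample covariance between two variables. This is one simple way to check direction, not necessarily the method used in this book. The function name `direction` and the data values are hypothetical and chosen only for illustration. The sign of the covariance captures only an overall linear tendency, so a more complex relationship may not be summarized well by this check.

```python
from statistics import mean


def direction(xs, ys):
    """Classify a relationship as positive, negative, or none
    using the sign of the sample covariance (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sample covariance: average product of paired deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    if cov > 0:
        return "positive"  # higher x tends to coincide with higher y
    if cov < 0:
        return "negative"  # higher x tends to coincide with lower y
    return "none"


# Made-up observations, for illustration only.
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 58, 61, 70, 75]
errors_made = [9, 7, 6, 4, 2]

print(direction(hours_studied, test_score))
print(direction(hours_studied, errors_made))
```

With these made-up values, the first pair rises together and the second pair moves in opposite directions, matching the definitions of positive and negative relationships given above.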
