TOPIC 27

Correlation Coefficient

In 1970, the United States Selective Service instituted a draft to decide which young men would be forced to join the armed forces. Wanting to be completely fair, they used a random lottery process that assigned draft numbers to birthdays: those born on days with low draft numbers were drafted. But was the lottery process carried out in a fair, truly random manner? In this topic, you will learn a new technique for analyzing such data and answering this question.

Overview

In the previous topic, you saw how scatterplots provide useful visual information about the relationship between two quantitative variables. Rather than relying on visual impressions alone, however, it is also handy to have a numerical measure of the strength of association between two variables—just as you made use of numerical summaries for various aspects of a single variable's distribution. This topic introduces you to such a measure and asks you to investigate some of its properties. This measure, one of the most famous in statistics, is the correlation coefficient.
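As a preview of what such a numerical measure looks like in practice, here is a minimal sketch in Python. It assumes the standard sample (Pearson) definition of the correlation coefficient, the average product of z-scores with an n - 1 divisor; that formula is not stated in this excerpt. The function name `correlation_coefficient` and the small data set are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def correlation_coefficient(xs, ys):
    """Sample correlation r, assuming the usual z-score definition:
    r = sum(z_x * z_y) / (n - 1). Hypothetical helper for illustration."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    # Standardize each pair of values, multiply, and sum the products.
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)


# Made-up illustrative data, not taken from the text.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]
r = correlation_coefficient(x_values, y_values)
print(round(r, 3))
```

A value near 1 or -1 would suggest a strong linear association, and a value near 0 a weak one; the activities in this topic explore these properties in more detail.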

Preliminaries

  1. If every student in a class scores exactly ten points lower on the second exam than on the first exam, does that indicate a positive association, negative association, or no association between the two exam scores? (Activity 27-6)
  2. If every student in a class scores exactly half as many points on the second exam as on the first exam, does that indicate a positive association, negative association, ...
