5.6 Correlating Data
Correlation measures the strength and direction of the relationship between two variables. Expressed another way, correlation measures the tendency of two variables to increase or decrease at the same time. This measure is often called the correlation coefficient. Returning to the earthquake data, we could ask the question “Is the magnitude of an earthquake correlated with its depth?”
Although several algorithms are available for calculating a correlation coefficient for a sample, we will use the Pearson correlation coefficient, which yields a number between −1 and 1 to measure the relationship between two variables. The formula for calculating this coefficient is
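$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$
where $\bar{x}$ and $\bar{y}$ are the means of the two variables $x$ and $y$, and $n$ is the number of paired observations.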
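As a minimal sketch, the Pearson coefficient can be computed directly from its definition: sum the products of the deviations from each mean, then divide by the square root of the product of the summed squared deviations. The function name `pearson_r` and the `magnitude` and `depth_km` values below are illustrative assumptions, not the earthquake data set or the code used elsewhere in this book.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up values for illustration only (not real earthquake measurements)
magnitude = [4.1, 4.8, 5.2, 5.9, 6.3]
depth_km = [10.0, 35.0, 22.0, 60.0, 48.0]

r = pearson_r(magnitude, depth_km)
print(f"r = {r:.3f}")
```

A result near 1 or −1 indicates a strong linear relationship, and a result near 0 indicates little or no linear relationship. For real data sets, library routines such as `numpy.corrcoef` or `scipy.stats.pearsonr` are likely a better choice than a hand-written loop.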