Measuring dependencies between discrete variables

You can pivot, or cross-tabulate two discrete variables. You measure counts, or absolute frequencies, of each combination of pairs of values of the two variables. You can compare the actual with the expected values in the table. So, what are the expected values? You start with the null hypothesis again—there is no association between the two variables you are examining. For the null hypothesis, you would expect that the distribution of one variable is the same in each class of the other variable, and the same as the overall distribution in the dataset. For example, if you have half married and half single people in the dataset, you expect such a distribution for each level of education. The ...

Get Data Science with SQL Server Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.