O'Reilly logo

Mastering Python for Data Science by Samir Madhavan

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

The chi-square test of independence

The chi-square test of independence is a statistical test used to determine whether two categorical variables are independent of each other or not.

Let's take the following example to see whether there is a preference for a book based on the gender of people reading it:

Flavour

Total

Biography

Suspense

Romance

Gender

280

60

120

100

Men

640

90

200

350

Women

920

150

320

450

 

The Chi-Square test of independence can be performed using the chi2_contingency function in the SciPy package:

>>> men_women = np.array([[100, 120, 60],[350, 200, 90]])
>>> stats.chi2_contingency(men_women)
(28.362103174603167, 6.9382117170577439e-07, 2, array([[ 136.95652174,   97.39130435,   45.65217391],
 [ 313.04347826, 222.60869565, ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required