Chi-squared multiple significance testing

Not all categories are dichotomous (such as male and female, survived and perished). Although we would expect categorical variables to have a finite number of categories, there is no hard upper limit on the number of categories a particular attribute can have.

We could use other categorical variables to separate out the passengers on the Titanic, such as the class in which they were traveling. There were three class levels on the Titanic, and the frequency-table function we constructed at the beginning of this chapter is already able to handle multiple classes.

(defn ex-4-12 []
  (->> (load-data "titanic.tsv")
       (frequency-table :count [:survived :pclass])))

This code generates the following frequency table: ...

Get Clojure for Data Science now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.