9 Categorical Data

Ignorance more frequently begets confidence than does knowledge: it is those who know little, and not those who know much, who so positively assert that this or that problem will never be solved by science.

Charles Darwin (1809–1882)

A categorical variable is a variable measured on a nominal or ordinal scale. Ordinal variables carry more information than nominal ones because their levels can be ordered. For example, an automobile could be categorized on an ordinal scale (compact, mid‐size, large) or on a nominal scale (Honda, Tesla, Audi). In contrast to interval data, which are quantitative, nominal data are qualitative, so comparisons between their levels cannot be described mathematically. Ordinal variables fall in between: their levels can be ranked, yet they are not quite quantitative. Categorical data analysis is ubiquitous in statistical practice, and we encourage readers interested in more comprehensive coverage to consult the monographs by Agresti (2012) and Simonoff (2003).
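The distinction between the two scales can be sketched in R, where a plain factor behaves as a nominal variable and an ordered factor behaves as an ordinal one. The five cars below are made-up illustrative values, and the variable names (`make`, `size`, and so on) are assumptions introduced only for this sketch.

```r
# Hypothetical data: the same five cars described on two scales.
make <- factor(c("Honda", "Tesla", "Audi", "Honda", "Audi"))      # nominal
size <- factor(c("compact", "large", "mid-size", "compact", "large"),
               levels = c("compact", "mid-size", "large"),
               ordered = TRUE)                                      # ordinal

# Both scales support counting observations per category.
make_counts <- table(make)
size_counts <- table(size)

# Only the ordered factor supports rank-based comparisons.
is_bigger_than_compact <- size > "compact"
largest_size <- max(size)

# For an unordered factor, "<" is not meaningful: R warns and returns NA.
nominal_comparison <- suppressWarnings(make[1] < make[2])

print(make_counts)
print(size_counts)
print(is_bigger_than_compact)
print(largest_size)
print(nominal_comparison)
```

Counting works on either scale, but only the ordinal variable can answer a question such as "which cars are larger than compact?". Neither scale supports arithmetic on the levels, which is what separates both from interval data.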

Toward the end of the nineteenth century, while probabilists in Russia, France, and other parts of the world were advancing statistical theory through probability, British academic researchers achieved great methodological developments in statistics through applications in the biological sciences. This was due in part to the surge of research that followed Charles Darwin's publication of The Origin of Species in 1859. Darwin's theories ...
