# Chapter 16 Categorical Data Analysis

Package(s): `gdata`

Dataset(s): `UCBAdmissions`, `Titanic`, `HairEyeColor`, `VADeaths`, `faithful`, `atombomb`, `Filariasistype`

## 16.1 Introduction

Discrete data may be classified into two forms: (i) nominal data and (ii) ordinal data. Nominal data consist of variables whose values are labels. For example, the variable gender takes two labels, male and female. Although we may denote males by 0 and females by 1, this does not mean that 1 is greater than 0, which is why such a variable is called a nominal variable. On the other hand, if we consider the rank of a student on the basis of marks, the first rank signifies more value than the second rank. Such variables are called ordinal variables. Categorical data analysis is concerned with the analysis of these kinds of variables.
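The distinction can be illustrated in R with factors. The following is a minimal sketch, not taken from the text: the vectors `gender` and `grade` are made-up illustrative data, and it assumes only base R functions (`factor`, `is.ordered`, `levels`, `table`).

```r
# Nominal variable: labels with no inherent order (illustrative data)
gender <- factor(c("male", "female", "female", "male"))
is.ordered(gender)   # FALSE
levels(gender)       # levels are sorted alphabetically, which is not a ranking

# Ordinal variable: labels with a meaningful order (illustrative data)
grade <- factor(c("second", "first", "third", "first"),
                levels = c("third", "second", "first"),
                ordered = TRUE)
is.ordered(grade)    # TRUE
grade[2] > grade[1]  # compares "first" with "second" using the declared order

# Counting the labels of a factor gives a one-way contingency table
table(gender)
```

Comparisons such as `>` are meaningful only for the ordered factor; applying them to `gender` would return `NA` with a warning, which mirrors the point that the codes of a nominal variable carry no magnitude.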

Categorical Data Analysis, abbreviated as CDA, requires data to be entered in a specific format, viz., contingency tables. In particular, in R, the data has to be read in a table format. Some of the standard datasets for CDA shipped with R include `UCBAdmissions`, `Titanic`, `HairEyeColor`, and `VADeaths`. Note that earlier datasets, such as `iris`, are of the class `data.frame`. The above-mentioned datasets are of the class `table` or `matrix` (`VADeaths` is a `matrix`; the other three are of class `table`), as can be verified in the next (small) program.

``````
> class(UCBAdmissions); class(Titanic); class(HairEyeColor)