Chapter 10 Coding and Multiple Correspondence Analysis

10.1 Introduction to Coding

When two categorical variables are considered, any variant of correspondence analysis can be performed easily, as described in the previous chapters. The situation becomes more complicated when three or more categorical variables are considered, because the data no longer take the form of a two-way matrix, or two-way table. Instead, the data form a multi-way table that (beyond the case of three variables) can be difficult to visualise. We may therefore code the data so that what was arranged as a multi-way array is expressed more simply as a two-way matrix, to which any of the simple correspondence analysis techniques can be applied.
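To make the idea concrete, the sketch below codes three categorical variables as a single two-way matrix of zeros and ones, with one column per category. This indicator-style coding is only one possible choice and is assumed here for illustration; the records, the variable names and the helper `indicator_matrix` are hypothetical and not taken from the text.

```python
# Hypothetical toy data: each row is one respondent, each position one categorical variable.
records = [
    ("female", "young", "yes"),
    ("male", "old", "no"),
    ("female", "old", "yes"),
    ("male", "young", "yes"),
]
variables = ("sex", "age", "answer")  # illustrative names only


def indicator_matrix(rows, names):
    """Return (column labels, 0/1 matrix) with one column per category of each variable."""
    columns = []
    for j, name in enumerate(names):
        # Sort the observed categories so the column order is deterministic.
        for category in sorted({row[j] for row in rows}):
            columns.append((j, category, f"{name}={category}"))
    # A cell is 1 when the respondent falls in that category of that variable.
    matrix = [
        [1 if row[j] == category else 0 for j, category, _ in columns]
        for row in rows
    ]
    labels = [label for _, _, label in columns]
    return labels, matrix


labels, Z = indicator_matrix(records, variables)
for label_row in (labels, *Z):
    print(label_row)
```

Each row of `Z` sums to the number of variables, since every respondent falls in exactly one category of each variable. The result is an ordinary two-way matrix, whatever the number of variables coded.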

By coding, we mean transforming the data, guided by detailed prior knowledge of the variables, so that the number of different categories is reduced: values are assigned objectively and mapped onto a limited number of subsets with specific characteristics (Van Rijckevorsel, 1987). Data coding can be defined in many other ways. Bourque (2004) says that coding is

a systematic way by which to condense extensive data sets into smaller analysable units through the creation of categories derived from the data,

while Lockyer (2004) describes it as

the process or function by which verbal data are converted into variables and categories of variables using numbers.
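As a minimal sketch of these definitions, the snippet below converts verbal responses into a small set of numeric category codes. The codebook, its three categories and the function `code_responses` are assumptions made purely for illustration; they are not a scheme proposed by the authors cited above.

```python
# Hypothetical codebook: five verbal responses condensed into three numeric categories.
CODEBOOK = {
    "strongly agree": 1,
    "agree": 1,
    "neutral": 2,
    "disagree": 3,
    "strongly disagree": 3,
}


def code_responses(responses, codebook=CODEBOOK):
    """Map each verbal response to its numeric category code.

    Responses are normalised (whitespace stripped, lower-cased) before lookup;
    a response missing from the codebook raises KeyError rather than being guessed.
    """
    return [codebook[response.strip().lower()] for response in responses]


coded = code_responses(["Agree", "neutral", "Strongly disagree", "agree "])
print(coded)
```

Raising an error on an unknown response, rather than assigning a default code, keeps the mapping objective: every coded value can be traced back to an explicit entry in the codebook.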

It is clear then there ...

