Chapter 5 Exploratory data analysis

5.1 General remarks

This chapter addresses the first steps in any analysis of a compositional data set. The data set is represented as a matrix X with n rows (observed compositions) and D columns (parts). An exploratory analysis includes the following steps:

  1. computing descriptive statistics, that is, the center and variation matrix of a data set, as well as its total variability;
  2. looking at the biplot of the data set to discover patterns;
  3. plotting patterns in ternary diagrams of subcompositions, possibly centered to enhance visualization;
  4. defining an appropriate representation in orthonormal coordinates and computing the corresponding coordinates; and
  5. computing classical summary statistics of the coordinates and representing the results in a balance-dendrogram.
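As a rough illustration of step 1, the sketch below computes the center (closed geometric mean), the variation matrix (variances of all pairwise log-ratios), and the total variance of a small data set. The function names, the toy data, and the choice of the n - 1 divisor for the variances are assumptions made for this example, not something prescribed by the text.

```python
import numpy as np

def closure(x, kappa=1.0):
    """Rescale the parts so that each composition sums to kappa."""
    x = np.asarray(x, dtype=float)
    return kappa * x / x.sum(axis=-1, keepdims=True)

def center(X):
    """Closed vector of column-wise geometric means."""
    return closure(np.exp(np.log(X).mean(axis=0)))

def variation_matrix(X, ddof=1):
    """D x D matrix with entries var(ln(x_i / x_j)).

    ddof=1 (sample variance) is an assumption of this sketch.
    """
    logX = np.log(X)
    # pairwise log-ratios, shape (n, D, D)
    logratios = logX[:, :, None] - logX[:, None, :]
    return logratios.var(axis=0, ddof=ddof)

def total_variance(X, ddof=1):
    """Sum of all entries of the variation matrix divided by 2D."""
    D = X.shape[1]
    return variation_matrix(X, ddof=ddof).sum() / (2 * D)

# Hypothetical toy data: 4 observed compositions (rows), 3 parts (columns).
X = closure(np.array([
    [0.10, 0.30, 0.60],
    [0.20, 0.30, 0.50],
    [0.30, 0.20, 0.50],
    [0.25, 0.25, 0.50],
]))

g = center(X)
T = variation_matrix(X)
totvar = total_variance(X)
print("center:", g)
print("variation matrix:\n", T)
print("total variance:", totvar)
```

The variation matrix is symmetric with a zero diagonal, and the total variance agrees with the sum of the variances of the centered log-ratio components, which gives a quick consistency check on the computation.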

In general, the last two steps will be based on a particular sequential binary partition, defined either a priori or as a result of the insights provided by the first three steps.
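To make the role of the sequential binary partition concrete, the following sketch turns a sign matrix (one row per partition step, with +1 and -1 marking the two groups of parts and 0 marking parts not involved) into orthonormal balance coordinates, and then computes ordinary means and variances of those coordinates. The function names, the particular partition, and the toy data are hypothetical choices for illustration only.

```python
import numpy as np

def sbp_basis(signs):
    """Contrast matrix of a sequential binary partition.

    signs: (D-1) x D array with entries +1, -1, 0. For a step with r parts
    coded +1 and s parts coded -1, the coefficients are
    +sqrt(s / (r (r + s))) and -sqrt(r / (s (r + s))).
    Each row must contain at least one +1 and one -1.
    """
    signs = np.asarray(signs, dtype=float)
    r = (signs > 0).sum(axis=1, keepdims=True)
    s = (signs < 0).sum(axis=1, keepdims=True)
    pos = np.sqrt(s / (r * (r + s)))
    neg = -np.sqrt(r / (s * (r + s)))
    return np.where(signs > 0, pos, np.where(signs < 0, neg, 0.0))

def balances(X, signs):
    """Balance coordinates of the compositions in the rows of X."""
    # Rows of the contrast matrix sum to zero, so ln(X) can be used
    # directly in place of the centered log-ratio transform.
    return np.log(np.asarray(X, dtype=float)) @ sbp_basis(signs).T

# Hypothetical partition of 3 parts: {x1, x2} vs {x3}, then {x1} vs {x2}.
signs = np.array([
    [1,  1, -1],
    [1, -1,  0],
])

# Hypothetical toy data: 4 observed compositions, 3 parts.
X = np.array([
    [0.10, 0.30, 0.60],
    [0.20, 0.30, 0.50],
    [0.30, 0.20, 0.50],
    [0.25, 0.25, 0.50],
])

Psi = sbp_basis(signs)
B = balances(X, signs)

# Classical summary statistics apply directly to the coordinates.
print("balance means:", B.mean(axis=0))
print("balance variances:", B.var(axis=0, ddof=1))
```

Because the rows of the contrast matrix are orthonormal, the variances of the balances add up to the total variance of the data set, so the partition only redistributes the variability across coordinates.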

Before starting, some general considerations need to be made. The first step in a statistical analysis is to check the data set for errors. It can be done using standard procedures, for example, using the minimum ...
