Chapter 5 Exploratory data analysis
5.1 General remarks
This chapter addresses the first steps in any analysis of a compositional data set. The data set is represented as a matrix whose rows are the observed compositions and whose columns are the parts. An exploratory analysis includes the following steps:
- computing descriptive statistics, that is, the center and variation matrix of a data set, as well as its total variability;
- looking at the biplot of the data set to discover patterns;
- plotting patterns in ternary diagrams of subcompositions, possibly centered to enhance visualization;
- defining an appropriate representation in orthonormal coordinates and computing the corresponding coordinates; and
- computing classical summary statistics of the coordinates and representing the results in a balance-dendrogram.
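The descriptive statistics named in the first step can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's own code: it assumes the usual definitions (center as the closed vector of column-wise geometric means, variation matrix with entries var(ln(x_i/x_j)), total variability as the sum of those entries divided by 2D), a sample variance with divisor n - 1, and strictly positive data. The function names and the small data set are hypothetical.

```python
import math


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def center(data):
    """Closed vector of the column-wise geometric means."""
    n = len(data)
    n_parts = len(data[0])
    geo_means = [
        math.exp(sum(math.log(row[j]) for row in data) / n)
        for j in range(n_parts)
    ]
    return closure(geo_means)


def sample_variance(values):
    """Sample variance with divisor n - 1 (an assumption; some texts use n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def variation_matrix(data):
    """Matrix T with T[i][j] = var(ln(x_i / x_j)) taken over the rows."""
    n_parts = len(data[0])
    return [
        [
            sample_variance([math.log(row[i] / row[j]) for row in data])
            for j in range(n_parts)
        ]
        for i in range(n_parts)
    ]


def total_variance(data):
    """Total variability: sum of all entries of T divided by 2D."""
    t = variation_matrix(data)
    n_parts = len(t)
    return sum(sum(row) for row in t) / (2 * n_parts)


# Hypothetical 3-part data set: rows are compositions, columns are parts.
data = [
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
    [0.3, 0.2, 0.5],
]

cen = center(data)
T = variation_matrix(data)
totvar = total_variance(data)
```

Under these definitions the variation matrix is symmetric with a zero diagonal, and the total variability equals the sum of the variances of the centered log-ratio transformed parts, which gives a convenient cross-check.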
In general, the last two steps will be based on a particular sequential binary partition, defined either a priori or as a result of the insights provided by the first three steps.
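As an illustration of how a sequential binary partition yields orthonormal coordinates, the sketch below computes balances for a single composition. It assumes the standard balance formula, sqrt(r*s/(r+s)) times the log-ratio of the geometric means of the two groups, where r and s are the group sizes at each step. The sign-coded partition, the function name, and the example composition are hypothetical and chosen only for demonstration.

```python
import math


def balance_coordinates(x, sbp):
    """Balances of composition `x` for a sign-coded partition `sbp`.

    Each row of `sbp` codes one partition step: +1 marks parts in the
    numerator group, -1 parts in the denominator group, and 0 parts
    not involved in that step.
    """
    coords = []
    for signs in sbp:
        plus = [math.log(v) for v, sign in zip(x, signs) if sign > 0]
        minus = [math.log(v) for v, sign in zip(x, signs) if sign < 0]
        n_plus, n_minus = len(plus), len(minus)
        # Log of the ratio of the geometric means of the two groups.
        log_ratio = sum(plus) / n_plus - sum(minus) / n_minus
        # Normalizing factor that makes the coordinates orthonormal.
        factor = math.sqrt(n_plus * n_minus / (n_plus + n_minus))
        coords.append(factor * log_ratio)
    return coords


# Hypothetical partition of 4 parts into D - 1 = 3 steps.
sbp = [
    [+1, +1, -1, -1],  # parts 1 and 2 against parts 3 and 4
    [+1, -1, 0, 0],    # part 1 against part 2
    [0, 0, +1, -1],    # part 3 against part 4
]

x = [0.1, 0.2, 0.3, 0.4]
b = balance_coordinates(x, sbp)
```

Applying `balance_coordinates` to every row of a data matrix gives the coordinate columns on which classical summary statistics (means, variances, quantiles) can then be computed. Because balances are log-ratios, rescaling `x` by a positive constant leaves them unchanged.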
Before starting, some general considerations are in order. The first step in a statistical analysis is to check the data set for errors. This can be done using standard procedures, for example, using the minimum ...