Chapter 2
Detecting Clusters Graphically
2.1 Introduction
Graphical views of multivariate data are important in all aspects of their analysis. In general terms, graphical displays can provide insights into the structure of the data. From the point of view of this book in particular, they can suggest that the data may contain clusters, and consequently that some formal method of cluster analysis might usefully be applied. The usefulness of graphical displays in this context arises from the power of the human visual system in detecting patterns; a fascinating account of how human observers draw perceptually coherent clusters out of fields of dots is given in Feldman (1995). However, the following caveat from the late Carl Sagan should be kept in mind:
Humans are good at discerning subtle patterns that are really there, but equally so at imagining them when they are altogether absent.
In this chapter we describe a number of relatively simple, static graphical techniques that are often useful for providing evidence for or against possible cluster structure in the data. Most of the methods examine one of two kinds of display: direct univariate or bivariate marginal plots of the multivariate data (i.e. plots obtained using the original variables), or indirect one- or two-dimensional ‘views’ of the data obtained by applying a suitable dimension-reduction technique, for example principal components analysis. ...
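To make the distinction between direct and indirect views concrete, the following is a minimal sketch in Python and not a method taken from this chapter. It assumes synthetic three-variable data drawn from two well-separated groups, uses crude text histograms in place of proper plots, and finds the first principal component by power iteration on the sample covariance matrix instead of calling a library routine. The names `marginal_counts` and `first_principal_component` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def marginal_counts(values, bins=10):
    """Bin counts for a one-dimensional view of the data (a crude histogram)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[idx] += 1
    return counts


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix. Assumes the
    starting vector is not orthogonal to the leading eigenvector.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        new = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores


# Synthetic data (an assumption for illustration): two groups of 50 points.
rng = random.Random(0)
group_a = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(50)]
group_b = [[rng.gauss(4, 1) for _ in range(3)] for _ in range(50)]
data = group_a + group_b

direction, scores = first_principal_component(data)

# Direct view: a marginal histogram of the first original variable.
# Indirect view: a histogram of the first principal component scores.
views = [("variable 1", [row[0] for row in data]), ("first PC", scores)]
for label, values in views:
    print(label)
    for count in marginal_counts(values):
        print("  " + "*" * count)
```

With groups this far apart, both histograms would be expected to show two modes. The indirect view is of most interest when no single original variable separates the groups on its own. In either case a gap in the display is only a hint of cluster structure, not evidence that would replace a formal analysis.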