13.6. Choice of the Best Number of Clusters

So far, we have focused on various hierarchical algorithms. We now turn our attention to the important task of determining the best clustering within a given hierarchy. Clearly, this is equivalent to identifying the number of clusters that best fits the data. An intuitive approach is to search the proximity dendrogram for clusters that have a large lifetime. The lifetime of a cluster is defined as the absolute value of the difference between the proximity level at which it is created and the proximity level at which it is absorbed into a larger cluster. For example, the dendrogram of Figure 13.17a suggests that two major clusters are present, whereas that of Figure 13.17b suggests only one. ...
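
The lifetime definition above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the text: the function name `cluster_lifetimes`, the representation of the dendrogram as a list of `(a, b, level)` merges (the i-th merge creates cluster id `n_points + i`, a convention borrowed from common linkage-matrix formats), and the choice to treat single-point clusters as created at dissimilarity level 0. The toy data are four points on a line at 0, 1, 10, and 11, merged by hand under single-link distances.

```python
def cluster_lifetimes(n_points, merges):
    """Return {cluster_id: lifetime} for every cluster that is eventually absorbed.

    Assumed (hypothetical) input convention:
      - ids 0 .. n_points-1 are the single-point clusters,
      - the i-th entry (a, b, level) of `merges` joins clusters a and b at
        dissimilarity `level` and creates the cluster with id n_points + i.
    """
    # Assumption: single-point clusters are "created" at dissimilarity level 0.
    created = {i: 0.0 for i in range(n_points)}
    absorbed = {}
    for i, (a, b, level) in enumerate(merges):
        created[n_points + i] = level   # the new cluster appears at this level
        absorbed[a] = level             # both children are absorbed here
        absorbed[b] = level
    # Lifetime = |level of absorption - level of creation|.
    # The final (root) cluster is never absorbed, so it has no entry.
    return {cid: abs(absorbed[cid] - created[cid]) for cid in absorbed}


# Toy dendrogram for points at 0, 1, 10, 11 (single-link distances).
merges = [
    (0, 1, 1.0),   # creates cluster 4 = {0, 1}
    (2, 3, 1.0),   # creates cluster 5 = {10, 11}
    (4, 5, 9.0),   # creates cluster 6 = all points
]
lifetimes = cluster_lifetimes(4, merges)

# List clusters from longest-lived to shortest-lived.
for cid, life in sorted(lifetimes.items(), key=lambda kv: -kv[1]):
    print(f"cluster {cid}: lifetime {life}")
```

In this toy case clusters 4 and 5 each live from level 1 to level 9, far longer than any single-point cluster, which is the kind of pattern that would suggest two major clusters.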

