6.8 Clustering
In Section 4.8 we examined the k-means clustering algorithm: k initial points are chosen as cluster centers, every data point is assigned to its nearest center, the mean of the points in each cluster becomes that cluster's new center, and the process repeats until the cluster assignments no longer change. This procedure works only when the number of clusters is known in advance; this section begins by describing what you can do when it is not.
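As a rough illustration of the k-means loop recalled above, here is a minimal Python sketch. The function name `kmeans`, the `seed` and `max_iter` parameters, the choice of initial centers by random sampling from the data, and the use of Euclidean distance are all assumptions made for this example, not details taken from Section 4.8.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Minimal k-means sketch (hypothetical helper, not the book's code).

    points: list of equal-length tuples of floats
    k:      number of clusters, which must be known in advance
    """
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = rng.sample(points, k)
    assignment = None

    for _ in range(max_iter):
        # Assign every point to its nearest center (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Stop when no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment

        # Recompute each center as the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return centers, assignment


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, assignment = kmeans(data, k=2)
    print(centers)
    print(assignment)
```

Note that `k` is a required argument: the sketch has no way to discover the number of clusters on its own, which is the limitation this section goes on to address.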
Next we take a look at techniques for creating a hierarchical clustering structure by “agglomeration”—that is, starting with individual instances and successively joining them up into clusters. Then we look at a method that works incrementally; that is, ...