Cluster instability as a performance metric

Cluster instability is a method proposed by von Luxburg (in Clustering stability: an overview, von Luxburg U., arXiv:1007.1075v1, 2010) that measures how well an algorithm fits a specific dataset. It can be employed for different purposes (for example, tuning hyperparameters or finding the optimal number of clusters) and it's relatively easy to compute. The method is based on the idea that a clustering result that meets the requirements of maximum cohesion and separation should also be robust to noisy perturbations of the dataset. In other words, if dataset X has been segmented into cluster set C, a derived dataset Xn (based on small perturbations of the features) should be mapped ...
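The idea of clustering perturbed copies of X and checking whether the assignments agree can be sketched in a few lines. This is only an illustration under stated assumptions, not the formula from the paper or the book: the helper name cluster_instability, the Gaussian noise model, the noise_scale value, the use of K-means, and the choice of 1 minus the mean pairwise adjusted Rand index as the disagreement measure are all assumptions introduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score


def cluster_instability(X, n_clusters, n_perturbations=10, noise_scale=0.1, seed=0):
    """Hypothetical helper: 1 - mean pairwise ARI between clusterings of noisy copies of X."""
    rng = np.random.default_rng(seed)
    labelings = []
    for _ in range(n_perturbations):
        # Xn: a copy of X with small Gaussian noise on every feature (assumed noise model).
        Xn = X + rng.normal(0.0, noise_scale, size=X.shape)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labelings.append(km.fit_predict(Xn))
    # Compare every pair of labelings; ARI does not depend on how clusters are numbered.
    scores = [
        adjusted_rand_score(labelings[i], labelings[j])
        for i in range(n_perturbations)
        for j in range(i + 1, n_perturbations)
    ]
    return 1.0 - float(np.mean(scores))


# Synthetic data with three well-separated blobs (illustrative only).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]],
    cluster_std=0.5,
    random_state=0,
)

for k in (2, 3, 4, 5):
    print(k, round(cluster_instability(X, k), 3))
```

A lower value means that the perturbed datasets are segmented in nearly the same way. When the score is used to pick the number of clusters or tune a hyperparameter, the candidate with the lowest instability is usually preferred, although the result depends on how strong the chosen perturbation is.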
