The notion of outliers is closely related to that of clusters. Clustering-based approaches detect outliers by examining the relationship between objects and clusters. Intuitively, an outlier is an object that belongs to a small and remote cluster, or does not belong to any cluster.
This leads to three general approaches to clustering-based outlier detection. For a given object, each approach asks a different question:
■ Does the object belong to any cluster? If not, then it is identified as an outlier.
■ Is there a large distance between the object and the cluster to which it is closest? If yes, it is an outlier.
■ Is the object part of a small or sparse cluster? If yes, then all the objects in that cluster are outliers.
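The three questions can be sketched as a single pass over an already clustered data set. This is only a minimal illustration under several assumptions that are not part of the text: the function name `clustering_based_outliers`, the convention that a label of -1 means "no cluster", the thresholds `max_dist` and `min_size`, and the use of the distance to the nearest cluster centroid as the object-to-cluster distance are all hypothetical choices. Cluster sparsity is not checked here, only cluster size.

```python
from collections import Counter
from math import dist


def clustering_based_outliers(points, labels, max_dist, min_size):
    """Return one flag per object: the reason it is an outlier, or None.

    Assumptions (illustrative only): labels[i] == -1 means object i belongs
    to no cluster; a cluster with fewer than min_size members is "small";
    an object farther than max_dist from its nearest centroid is "far".
    """
    # Size of every real cluster (the -1 "no cluster" label is skipped).
    sizes = Counter(label for label in labels if label != -1)

    # Centroid of each cluster: the coordinate-wise mean of its members.
    centroids = {}
    for cluster in sizes:
        members = [p for p, label in zip(points, labels) if label == cluster]
        centroids[cluster] = tuple(sum(xs) / len(members) for xs in zip(*members))

    flags = []
    for p, label in zip(points, labels):
        if label == -1:
            # Question 1: the object does not belong to any cluster.
            flags.append("no_cluster")
        elif sizes[label] < min_size:
            # Question 3: every object in a small cluster is an outlier.
            flags.append("small_cluster")
        elif min(dist(p, c) for c in centroids.values()) > max_dist:
            # Question 2: the object is far from its closest cluster.
            flags.append("far_from_cluster")
        else:
            flags.append(None)
    return flags


# Toy data: a cluster of five points, a cluster of two, and one unassigned point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (10, 10), (10, 11), (20, 20)]
labels = [0, 0, 0, 0, 0, 1, 1, -1]
flags = clustering_based_outliers(points, labels, max_dist=3.0, min_size=3)
```

On this toy data, the point (5, 5) is assigned to the large cluster but lies far from every centroid, the two points of cluster 1 are flagged because their cluster is small, and (20, 20) is flagged because it has no cluster. In practice the thresholds would have to be chosen for the data at hand.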
Let’s look at examples of each ...