March 2017
Beginner to intermediate
866 pages
18h 4m
English
Choosing the optimal value of K is one of the harder parts of k-means clustering. A few methods can help with this.
We discussed earlier that a good cluster is defined by how compact its observations are. This compactness is quantified by the intra-cluster distance, which for a given cluster is the sum of the pairwise distances between all possible pairs of points in that cluster.
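As a rough illustration of the definition above, the following minimal sketch sums the distances over every pair of points in a single cluster. It assumes Euclidean distance, which the text does not specify, and the function name `intra_cluster_distance` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance; requires Python 3.8+


def intra_cluster_distance(points):
    """Sum of pairwise distances over all unordered pairs of points in one cluster.

    Assumption: distance is Euclidean. Any other metric could be swapped in.
    """
    # combinations(points, 2) yields each unordered pair exactly once
    return sum(dist(p, q) for p, q in combinations(points, 2))


# Hypothetical cluster of three 2-D observations
cluster = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
w = intra_cluster_distance(cluster)
print(w)
```

A tighter cluster gives a smaller value, and a cluster with a single point gives zero because it has no pairs.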
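If we denote the intra-cluster distance by W, then for a cluster k, the intra-cluster distance can be denoted by:

$$W_k = \sum_{\substack{i, j \in C_k \\ i < j}} d(x_i, x_j)$$

Here, C_k is the set of points in cluster k and d(x_i, x_j) is the distance between points x_i and x_j.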
Generally, the normalized ...