Cluster quality metrics help select among alternative clustering results. The kmeans_evaluation notebook illustrates the following options:
- The k-Means objective function suggests comparing how the inertia, or within-cluster variance, evolves as the number of clusters increases.
- Initially, additional centroids decrease the inertia sharply because new clusters improve the overall fit.
- Once an appropriate number of clusters has been found (assuming it exists), new centroids reduce the within-cluster variance by much less because they tend to split natural groupings.
- Hence, when k-Means finds a good cluster representation of the data, the inertia tends to follow an elbow-shaped path similar to the explained variance ratio for PCA, as shown in the following ...
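
The elbow heuristic described in the list above can be sketched in a few lines. This is a minimal illustration, not the notebook's code: it assumes scikit-learn's KMeans estimator and its inertia_ attribute, and the synthetic three-cluster data from make_blobs, the helper name inertia_by_k, and the range of k values are all hypothetical choices made for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def inertia_by_k(X, ks, seed=42):
    """Fit k-Means once per value in ks and return the resulting inertias."""
    inertias = []
    for k in ks:
        # n_init=10 runs the algorithm with 10 different centroid seeds
        # and keeps the solution with the lowest inertia.
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        inertias.append(model.inertia_)
    return inertias


# Assumption: synthetic data with three well-separated groups.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

ks = list(range(1, 7))
inertias = inertia_by_k(X, ks)

# Print the inertia and the reduction achieved by each additional centroid.
for i, k in enumerate(ks):
    drop = inertias[i - 1] - inertias[i] if i > 0 else float("nan")
    print(f"k={k}: inertia={inertias[i]:10.1f}  reduction={drop:10.1f}")
```

On data with a clear grouping like this, the reduction should be large up to the true number of clusters and small afterwards. On real data, the elbow is often less pronounced, so the inertia path is best read as a guide rather than a definitive answer.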