If we do not have a gold-standard set of labels to compare our clusters against, we have to evaluate how well our clustering technique performs using internal criteria. In other words, we can still evaluate our clustering by measuring similarity and dissimilarity within and between the clusters themselves.
The first of these internal metrics that we will present here is the silhouette coefficient. It can be calculated for each clustered data point as follows:

$$s = \frac{b - a}{\max(a, b)}$$
Here, a is the mean distance between a data point and all other points in the same cluster (the Euclidean distance, ...
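To make the calculation concrete, here is a minimal sketch in plain Python. It assumes the standard definition s = (b - a) / max(a, b), with b taken to be the mean distance from the point to the points of the nearest other cluster, and it uses the Euclidean distance. The function name `silhouette_coefficient`, the singleton-cluster convention, and the toy data are illustrative assumptions, not taken from the text.

```python
import math


def silhouette_coefficient(points, labels, i):
    """Silhouette coefficient of points[i], assuming s = (b - a) / max(a, b)."""
    own = labels[i]

    # Collect Euclidean distances from point i to every other point,
    # grouped by the cluster label of the other point.
    dists = {}
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j != i:
            dists.setdefault(lab, []).append(math.dist(points[i], p))

    # Assumed convention: a point alone in its cluster scores 0.
    if own not in dists:
        return 0.0

    # a: mean distance to the other points in the same cluster.
    a = sum(dists[own]) / len(dists[own])

    # b (assumed definition): mean distance to the nearest other cluster.
    other_means = [sum(d) / len(d) for lab, d in dists.items() if lab != own]
    if not other_means:
        raise ValueError("at least two clusters are required")
    b = min(other_means)

    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


# Toy data: two tight, well-separated clusters.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]

scores = [silhouette_coefficient(points, labels, i) for i in range(len(points))]
print(scores)                     # one coefficient per data point
print(sum(scores) / len(scores))  # mean over all points, a common summary
```

On this toy data every coefficient is close to 1, because each point is much nearer to its own cluster than to the other one. Relabelling the same points as `[0, 1, 0, 1]` pushes the coefficients below zero, which signals that points have been assigned to the wrong cluster.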