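The silhouette method requires that we first compute a silhouette score for each data point. The silhouette score for a single data point is the average dissimilarity between that point and all points in the next-nearest cluster, minus the average dissimilarity between that point and the other points in its own cluster, divided by the larger of these two numbers. Writing $a(i)$ for the average dissimilarity of point $i$ to the other points in its own cluster and $b(i)$ for its average dissimilarity to the points in the next-nearest cluster, this is represented using the following formula:

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$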
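To make the definition concrete, here is a minimal sketch in Python of the per-point computation described above. It rests on several assumptions that the text does not specify: Euclidean distance is used as the dissimilarity, there are at least two clusters, and a point that is alone in its cluster is given a score of 0 by convention. The function name `silhouette_scores` and the variables `a` and `b` (the within-cluster and next-nearest-cluster averages) are illustrative names chosen for this example.

```python
import numpy as np


def silhouette_scores(X, labels):
    """Per-point silhouette scores, assuming Euclidean dissimilarity."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances between all points, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    clusters = np.unique(labels)
    scores = np.zeros(len(X))

    for i in range(len(X)):
        # Other members of the same cluster (the point itself is excluded).
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            continue  # assumption: a singleton cluster keeps a score of 0

        # a: average dissimilarity to the other points in the same cluster.
        a = dist[i, same].mean()

        # b: smallest average dissimilarity to any other cluster,
        # i.e. the average over the next-nearest cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])

        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0

    return scores


# Two well-separated groups of two points each (made-up illustrative data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = np.array([0, 0, 1, 1])

scores = silhouette_scores(X, labels)
print(scores)         # one score per data point
print(scores.mean())  # average silhouette score for this clustering
```

In practice a library routine is usually preferable to a hand-written loop; scikit-learn, for example, provides `sklearn.metrics.silhouette_samples` for per-point scores and `sklearn.metrics.silhouette_score` for their average.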
We plot the average silhouette score for all data points in the dataset. A large silhouette score, close to 1, means that the data point is dissimilar to data points in other clusters, and the data point seems to ...