Assessing the performance of a clustering method
Without the true labels, we cannot use the metrics introduced in the previous chapter. In this recipe, we will introduce three measures that help us assess the effectiveness of our clustering methods: Davies-Bouldin, Pseudo-F (sometimes referred to as Calinski-Harabasz), and the Silhouette Score. All three are internal evaluation metrics: they are computed from the data and the cluster assignments alone. In contrast, if we knew the true labels, we could use external measures such as the Adjusted Rand Index, Homogeneity, or Completeness scores, to name a few.
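As a rough illustration of how the three internal metrics can be computed, here is a minimal sketch using scikit-learn. The synthetic dataset from make_blobs, the choice of KMeans, the candidate cluster counts, and the helper name internal_scores are all assumptions made for this example and are not taken from the recipe itself.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic data (an assumption for illustration, not the recipe's dataset).
# The true labels returned by make_blobs are discarded on purpose.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)


def internal_scores(X, labels):
    """Hypothetical helper: compute the three internal metrics for one clustering."""
    return {
        "davies_bouldin": davies_bouldin_score(X, labels),
        "pseudo_f": calinski_harabasz_score(X, labels),
        "silhouette": silhouette_score(X, labels),
    }


# Score k-means solutions for several candidate numbers of clusters.
results = {}
for k in (2, 3, 4, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    results[k] = internal_scores(X, labels)
    print(k, results[k])
```

When comparing the printed scores, a lower Davies-Bouldin value indicates better-separated clusters, while higher Pseudo-F and Silhouette values are better. The Silhouette Score always lies between -1 and 1.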
Note
Refer to the scikit-learn documentation on clustering for a deeper overview of the evaluation metrics available for clustering methods, including the external ones:
http://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation ...