December 2018
Hierarchical clustering provides insight into degrees of similarity among observations as it successively merges data. A significant change in the similarity metric from one merge to the next suggests that a natural clustering existed just before that merge.
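As a minimal sketch of this idea, the following example looks for the largest jump between consecutive merges and reports how many clusters existed just before it. It assumes SciPy's `linkage` function with Ward linkage, which records a merge distance (the inverse notion of similarity) for every merge. The toy data, the random seed, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical toy data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Each row of Z describes one merge; column 2 holds the merge distance.
Z = linkage(X, method="ward")
merge_distances = Z[:, 2]

# Change in merge distance from each merge to the next.
jumps = np.diff(merge_distances)
largest = int(np.argmax(jumps))

# There are n - 1 merges in total; once the merge with index i has been
# carried out, n - (i + 1) clusters remain.
n_clusters_before_jump = len(X) - (largest + 1)
print(n_clusters_before_jump)
```

On data with less clearly separated groups, the largest jump is only a heuristic, and several candidate cut points may deserve a closer look.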
The dendrogram visualizes the successive merges as a binary tree, displaying the individual data points as leaves and the final merge as the root of the tree. It also shows how the similarity monotonically decreases from bottom to top. Hence, it is natural to select a clustering by cutting the dendrogram.
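Cutting the dendrogram can be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: it uses SciPy's `linkage`, `fcluster`, and `dendrogram` functions, hypothetical toy data, and a hand-picked cut height of 20 that happens to fall between the within-group and between-group merge distances for this data.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Hypothetical toy data: three well-separated 2-D blobs of 15 points each.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(15, 2)) for c in centers])

Z = linkage(X, method="ward")

# Merge distances never decrease from the leaves to the root, which is
# the same as saying that similarity decreases from bottom to top.
heights_monotone = bool(np.all(np.diff(Z[:, 2]) >= 0))

# Cut the tree at a fixed height (an assumed threshold for this toy data):
# every subtree that merges below the cut becomes one flat cluster.
cut_height = 20.0
labels = fcluster(Z, t=cut_height, criterion="distance")
n_clusters = len(np.unique(labels))

# no_plot=True returns the tree layout as a dict without drawing anything.
tree = dendrogram(Z, no_plot=True)
n_leaves = len(tree["leaves"])

print(heights_monotone, n_clusters, n_leaves)
```

Passing `criterion="maxclust"` to `fcluster` instead asks for a target number of clusters rather than a cut height, which can be more convenient when the distance scale is hard to interpret.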
The following screenshot (see the hierarchical_clustering notebook for implementation details) illustrates the dendrogram for the classic Iris dataset with three classes ...