April 2016
Beginner to intermediate
384 pages
8h 36m
English
The hierarchical clustering model aims to build a hierarchy of clusters. Conceptually, you might think of it as a decision tree of clusters: based on the similarity (or dissimilarity) between clusters, they are either aggregated into more general clusters or divided into more specific ones. The aggregating (agglomerative) approach is often referred to as bottom up, while the dividing (divisive) approach is called top down.
To execute this recipe, you will need pandas, SciPy, and PyLab. No other prerequisites are required.
Hierarchical clustering can be extremely slow for big datasets, as the complexity of the agglomerative algorithm is O(n^3). To estimate our model, we use a single-linkage algorithm that has better complexity, ...
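As a rough illustration of single-linkage agglomerative clustering, here is a minimal sketch using SciPy's linkage and fcluster functions. The synthetic two-blob data, the variable names, and the choice to cut the hierarchy into two clusters are assumptions made for this example; they are not the recipe's own dataset or code.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: two tight, well-separated blobs of 2-D points.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(5, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(5, 2))
points = np.vstack([blob_a, blob_b])

# Bottom-up (agglomerative) clustering with single linkage:
# at each step the two clusters whose closest members are nearest get merged.
# Each row of `merges` records one merge, so there are n - 1 rows.
merges = linkage(points, method='single')

# Cut the hierarchy so that at most two flat clusters remain.
labels = fcluster(merges, t=2, criterion='maxclust')
print(labels)
```

Each point starts as its own cluster, and the merge table can also be passed to scipy.cluster.hierarchy.dendrogram to draw the resulting tree.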