July 2017
Intermediate to advanced
360 pages
8h 26m
English
To better understand the agglomeration process, it's useful to introduce a graphical method called a dendrogram, which shows statically how the aggregations are performed, from the bottom (where all samples are separate) to the top (where the linkage is complete). Unfortunately, scikit-learn doesn't support dendrograms. However, SciPy (which is a mandatory dependency of scikit-learn) provides some useful built-in functions.
Let's start by creating a dummy dataset:
>>> from sklearn.datasets import make_blobs
>>> nb_samples = 25
>>> X, Y = make_blobs(n_samples=nb_samples, n_features=2, centers=3, cluster_std=1.5)
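As a rough illustration of the SciPy route mentioned above, the following minimal sketch builds the linkage matrix for a dataset like this one and derives the dendrogram structure from it. The choice of Ward linkage with a Euclidean metric and the fixed `random_state` are assumptions made here for reproducibility, not settings taken from the text.

```python
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.datasets import make_blobs

nb_samples = 25

# random_state is an assumption added only to make the sketch reproducible
X, Y = make_blobs(n_samples=nb_samples, n_features=2, centers=3,
                  cluster_std=1.5, random_state=1000)

# Each row of Z describes one merge: the two cluster indices joined,
# the distance between them, and the size of the new cluster.
# Ward linkage on Euclidean distances is an assumed choice.
Z = linkage(X, method='ward', metric='euclidean')

# no_plot=True returns the dendrogram structure without drawing it;
# drop the flag (with matplotlib installed) to render the figure.
tree = dendrogram(Z, no_plot=True)

print(Z.shape)
print(tree['leaves'])
```

With n samples, the linkage matrix always has n - 1 rows, one per merge, and the `leaves` entry lists the sample indices in the left-to-right order in which they would appear along the dendrogram's horizontal axis.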
To avoid excessive complexity in the resulting plot, the number of samples has been kept very low. In the following figure, ...