July 2017
Intermediate to advanced
360 pages
8h 26m
English
scikit-learn also allows specifying a connectivity matrix, which acts as a constraint when choosing the clusters to merge. In this way, clusters that are far from each other (non-adjacent in the connectivity matrix) are never considered for merging. A very common way to create such a matrix is the k-nearest neighbors graph (implemented as kneighbors_graph()), which connects each sample to its k nearest neighbors according to a specific metric. In the following example, we consider a circular dummy dataset (also often used in the official documentation):
>>> from sklearn.datasets import make_circles
>>> nb_samples = 3000
>>> X, _ = make_circles(n_samples=nb_samples, noise=0.05)
A graphical representation ...
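The connectivity constraint described above can be sketched as follows. This is a minimal illustration rather than the book's own listing: the values n_neighbors=10 and n_clusters=20, the Ward linkage, and the fixed random_state are assumptions chosen only to make the example reproducible.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_circles
from sklearn.neighbors import kneighbors_graph

# Same circular dataset as above; random_state is an assumption added
# here so that repeated runs produce the same points.
nb_samples = 3000
X, _ = make_circles(n_samples=nb_samples, noise=0.05, random_state=1000)

# Sparse (nb_samples x nb_samples) matrix: row i marks the k nearest
# neighbors of sample i. The value k=10 is an illustrative assumption.
connectivity = kneighbors_graph(X, n_neighbors=10, include_self=False)

# Only clusters that are adjacent in the connectivity matrix are
# candidates for merging. n_clusters=20 and Ward linkage are assumptions.
ac = AgglomerativeClustering(
    n_clusters=20, connectivity=connectivity, linkage="ward"
)
labels = ac.fit_predict(X)
```

Because the graph is sparse, the algorithm evaluates far fewer candidate merges than in the unconstrained case, which typically makes the fit faster on datasets of this size. Larger values of n_neighbors loosen the constraint, while smaller values tighten it.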