November 2016
Beginner to intermediate
941 pages
21h 55m
English
K-means is an unsupervised algorithm that partitions the points into K disjoint clusters, assumed to have similar variance, by minimizing the distortion (also called inertia).
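To make the distortion concrete, here is a minimal sketch that takes it to be the within-cluster sum of squared distances, with each cluster represented by the mean of its points. The function name `centroids_and_inertia`, the toy data, and the labels are assumptions made up for illustration; the sketch only evaluates a given assignment and does not perform the clustering itself.

```python
import numpy as np

def centroids_and_inertia(points, labels, k):
    """Return the k centroids (cluster means) and the total inertia.

    Hypothetical helper for illustration: it assumes every cluster
    index in range(k) has at least one point assigned to it.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Each centroid is the mean of the points assigned to that cluster.
    centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
    # Inertia: sum of squared distances from each point to its own centroid.
    inertia = float(((points - centroids[labels]) ** 2).sum())
    return centroids, inertia

# Toy data (made up): two well-separated groups of two points each in 2D.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]

centroids, inertia = centroids_and_inertia(points, labels, k=2)
print(centroids)  # one row per cluster
print(inertia)
```

With this assignment every point lies at distance 1 from its centroid, so the inertia is 4.0. Swapping the labels of two points from different groups would raise it sharply, which is the quantity K-means tries to keep small.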
Given a single parameter, K, the number of clusters to create, the K-means algorithm produces K sets of points S1, S2, …, SK, each represented by its centroid: C1, C2, …, CK. The generic centroid, Ci, is simply the mean of the points assigned to the cluster Si, which is the choice that minimizes the intra-cluster sum of squared distances. The outputs of the system are as follows: