June 2017
Beginner to intermediate
576 pages
15h 22m
English
K-means clustering is the most popular algorithm used for partition clustering. In k-means clustering, similarity between clusters is based, in part, upon distance measures. Generally, the goal is to group similar clusters together with each observation having a relatively small distance from other observations in the same clusters. On the other hand, another goal is to have the maximum distance from one cluster to the next, so that the distances can be discerned. It is essentially a balancing act in which there is a tradeoff between defining similarities within groups, and defining opposing dissimilarities between groups.