November 2017
Intermediate to advanced
374 pages
10h 19m
English
k-means is a simple algorithm that minimizes the within-cluster sum of squares, that is, the sum of squared distances from each point to the mean of its cluster. We'll be minimizing the sum of squares yet again!
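It does this by first setting a prespecified number of clusters, K, and then alternating between the following two steps: assigning each point to the cluster with the nearest centroid, and recomputing each centroid as the mean of the points assigned to it.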
This alternation continues until some specified criterion is met, such as a maximum number of iterations or cluster assignments that no longer change. Centroids are difficult to interpret, and it can also be very difficult to determine whether we have chosen the correct number of centroids. It's important to understand whether your data is labeled or unlabeled, as this will directly influence the evaluation ...
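The alternation described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the book's implementation: the function names `kmeans`, `sq_dist`, and `wcss` are hypothetical, the first K points are assumed as starting centroids, and the assumed stopping criterion is that cluster assignments stop changing (or `max_iter` is reached).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Minimal k-means sketch on a list of equal-length numeric tuples."""
    # Assumption: the first k points serve as the starting centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Assumed stopping criterion: assignments did not change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def wcss(points, centroids, labels):
    """Within-cluster sum of squares: the quantity k-means tries to minimize."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Toy data (made up for illustration): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, k=2)
total = wcss(points, centroids, labels)
```

On this toy data the loop settles with the first two points in one cluster and the last two in the other. With less cleanly separated data, the result depends on the starting centroids, which is one reason practical implementations usually try several initializations.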