February 2018
Intermediate to advanced
378 pages
10h 14m
English
If you don't know in advance how many clusters you have, how do you choose the optimal k? This is essentially a chicken-and-egg problem. Several approaches are popular, and we'll discuss one of them: the elbow method.
Do you remember the mysterious WCSS (within-cluster sum of squares) that we calculated on every iteration of k-means? This measure tells us how far the points in each cluster lie from their centroid. We can calculate it for several different values of k and plot the results. The plot usually looks similar to the following graph:

This plot should ...
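To make the WCSS-versus-k calculation concrete, here is a minimal sketch in Python. It is not the book's implementation: the function names (`squared_distance`, `kmeans_wcss`), the toy three-blob dataset, the fixed seeds, and the choice to keep the best of several restarts are all assumptions made for illustration. It prints the WCSS values rather than plotting them.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points, k, iterations=50, seed=0):
    """Run a basic Lloyd-style k-means and return the final WCSS.

    Hypothetical helper for illustration; `points` is a list of tuples.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct data points as a start
    for _ in range(iterations):
        # Assignment step: put every point into its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    # WCSS: squared distance from every point to its nearest centroid, summed.
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Toy data (an assumption): three Gaussian blobs of 30 points each.
data_rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [
    (data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
    for cx, cy in blob_centers
    for _ in range(30)
]

# Compute WCSS for k = 1..6, keeping the best of a few random restarts per k.
wcss_by_k = {
    k: min(kmeans_wcss(points, k, seed=s) for s in range(5))
    for k in range(1, 7)
}

for k, wcss in wcss_by_k.items():
    print(f"k={k}  WCSS={wcss:.1f}")
```

Plotting these values against k gives the kind of curve the elbow method inspects. Taking the best of several restarts is a design choice to soften the effect of an unlucky initialization; a single run per k would also work but can give a noisier curve.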