July 2017
Intermediate to advanced
254 pages
6h 29m
English
If k is not specified by the problem's context, the optimal number of clusters can be estimated using a technique called the elbow method. The elbow method plots the value of the cost function, here the average distortion, produced by different values of k. As k increases, the average distortion will decrease; each cluster will have fewer constituent instances, and the instances will be closer to their respective centroids. However, the improvements to the average distortion will decline as k increases. The value of k at which the improvement to the distortion declines the most is called the elbow. Let's use the elbow method to choose the number of clusters for a dataset. The following scatter plot visualizes a dataset with two obvious ...
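The elbow method described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the book's own listing: the data are synthetic (two well-separated blobs), the helper names `kmeans` and `average_distortion` are hypothetical, and the distortion is taken to be the mean Euclidean distance from each instance to its nearest centroid. The elbow is picked as the k after which the improvement drops the most.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm (hypothetical helper); returns the centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct instances chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every instance to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its instances; keep it if empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids


def average_distortion(X, centroids):
    """Mean Euclidean distance from each instance to its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1).mean()


# Synthetic data (an assumption): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=1.0, size=(50, 2)),
    rng.normal(loc=(10.0, 10.0), scale=1.0, size=(50, 2)),
])

ks = list(range(1, 7))
# For each k, keep the best of a few random restarts.
distortions = [
    min(average_distortion(X, kmeans(X, k, seed=s)) for s in range(5))
    for k in ks
]

# Improvement gained by moving from k-1 to k clusters, for k = 2..6.
improvements = -np.diff(distortions)
# Drop in improvement from one step to the next; the largest drop marks the elbow.
declines = improvements[:-1] - improvements[1:]
elbow_k = ks[int(np.argmax(declines)) + 1]

for k, d in zip(ks, distortions):
    print(f"k={k}: average distortion {d:.3f}")
print("estimated elbow:", elbow_k)
```

Plotting `distortions` against `ks` would give the usual elbow curve; the numeric rule used here is only one way to read the bend off that curve, and on less cleanly separated data the elbow is often ambiguous.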