January 2018
Intermediate to advanced
470 pages
11h 9m
English
The beauty of clustering algorithms such as K-means is that they can cluster data with an arbitrary number of features. They are great tools to use when you have raw data and would like to discover the patterns in it. However, deciding on the number of clusters prior to running the experiment might not be successful and may sometimes lead to overfitting or underfitting.
On the other hand, one thing common to all three algorithms (that is, K-means, bisecting K-means, and Gaussian mixture) is that the number of clusters must be determined in advance and supplied to the algorithm as a parameter. Hence, informally, determining the number of clusters is a separate optimization problem ...
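One common way to treat the choice of the number of clusters as its own search is the elbow heuristic: run the clustering for several candidate values of K and watch where the within-cluster sum of squares (WCSS) stops dropping sharply. The following is a minimal sketch in plain Python, not the book's own code. The helper names (`squared_distance`, `farthest_point_init`, `kmeans_wcss`), the synthetic three-blob data, and the deterministic farthest-point seeding are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point that is farthest from its
    nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_wcss(points, k, max_iterations=100):
    """Run Lloyd's K-means iterations and return the within-cluster sum of squares."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(coords) / len(members) for coords in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Synthetic data: three well-separated 2-D blobs of 50 points each (illustrative only).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in blob_centers
    for _ in range(50)
]

# Evaluate the cost for each candidate K; the "elbow" is where the drop flattens.
wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, wcss in wcss_by_k.items():
    print(f"k={k}  WCSS={wcss:.1f}")
```

On data like this, the cost should fall steeply until K reaches the true number of blobs and only marginally afterwards. On real data the elbow is often less clear-cut, so the heuristic is best read as a guide rather than a definitive answer.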