October 2019
Beginner to intermediate
498 pages
14h 13m
English
The basic K-means cluster analysis algorithm is easy to implement, but it can lead to a number of problems. We briefly describe a few of these problems here and leave the solutions for you as exercises.
Step 2 of our algorithm required that we pick k data points to serve as the initial centroids. Our solution was to use a random selection, but this meant that two runs of the program would likely produce different results. It seems intuitive that by choosing the centroids in a more intentional manner, we can guide the way in which the clusters are ultimately constructed. This determination could be based either on user input or on data analysis.
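As a minimal sketch of what a more intentional choice might look like, the code below compares two options. The function names, the `seed` parameter, and the sample data are hypothetical and do not come from the program in the text. The first option keeps the random selection but fixes the seed so that repeated runs agree. The second is a farthest-first heuristic, one of many possible data-driven choices and not necessarily the solution the exercises have in mind.

```python
import math
import random


def random_centroids(points, k, seed=None):
    """Pick k distinct data points at random.

    Passing the same seed makes repeated runs return the same centroids.
    """
    rng = random.Random(seed)
    return rng.sample(points, k)


def farthest_first_centroids(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Score each point by its distance to the nearest chosen centroid,
        # and take the point with the largest score.
        next_point = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centroids),
        )
        centroids.append(next_point)
    return centroids


# Hypothetical two-dimensional data with three loose groups.
data = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 9.0), (0.5, 8.5)]

seeded_a = random_centroids(data, 3, seed=42)
seeded_b = random_centroids(data, 3, seed=42)
spread = farthest_first_centroids(data, 3)
```

With a fixed seed, `seeded_a` and `seeded_b` are identical, so the clustering that follows is repeatable. The farthest-first version needs no seed at all, although its result depends on which point comes first in the data.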
It is possible that clusters may become empty as ...