The execution of the k-means algorithm involves the following steps:
- Randomly select k observations from the dataset as the initial cluster centroids.
- For each observation in the dataset, perform the following:
  - Compute the distance between the observation and each cluster centroid.
  - Identify the centroid that is closest to the observation.
  - Assign the observation to that centroid's cluster.
- Once every observation has been assigned to a cluster, compute new cluster centroids by taking the mean of all the observations assigned to each cluster.
- Repeat the assignment step and the centroid-update step until the cluster centroids (means) no longer change or until a user-defined number ...
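
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `k_means`, the `max_iters` and `seed` parameters, the use of Euclidean distance, and the choice to leave an empty cluster's centroid unchanged are all assumptions made for the example.

```python
import math
import random


def k_means(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Initialization: pick k observations at random as the starting centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(max_iters):
        # Assignment: give each observation the index of its nearest centroid
        # (Euclidean distance is assumed here).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]

        # Update: recompute each centroid as the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])

        # Stop as soon as the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, labels


# Two visibly separated groups of 2-D observations (made-up data).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
print(centroids, labels)
```

Because the initial centroids are chosen at random, the cluster indices (and, on harder data, the clusters themselves) can differ between runs; fixing `seed` only makes a single run reproducible.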