The K-Means algorithm is characterized by the following steps:
- Initialization: The initial centroids are chosen, one for each of the clusters requested by the analyst. Since the true number of clusters is usually not known in advance, settling on this number often requires trial and error.
- Data assignment to the clusters: Each data point is assigned to the cluster whose centroid is closest to it, that is, the centroid at the minimum Euclidean distance from the point. In the first iteration, these are the centroids chosen in the initialization phase.
- Centroids update: Being an iterative process, the K-Means algorithm proceeds again to the estimation ...
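
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation, and it rests on several assumptions that are not stated in the text: the initial centroids are drawn at random from the data points, each centroid is updated to the mean of the points assigned to it (the standard K-Means update), an empty cluster keeps its previous centroid, and the loop stops once the assignments no longer change. The function names `assign`, `update`, and `kmeans`, as well as the parameters `max_iter` and `seed`, are hypothetical and chosen only for this example.

```python
import math
import random


def assign(points, centroids):
    """Assignment step: index of the nearest centroid (Euclidean distance) for each point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Update step: move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids


def kmeans(points, k, max_iter=100, seed=0):
    """Run the three steps: initialization, assignment, centroid update."""
    rng = random.Random(seed)
    # Initialization (assumption): pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments are stable, so stop iterating
            break
        labels = new_labels
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    final_centroids, final_labels = kmeans(data, k=2)
    print(final_centroids, final_labels)
```

Because the result depends on the starting centroids, it is common to run a sketch like this with several values of `seed` and of `k` and compare the outcomes, which mirrors the trial-and-error approach described in the initialization step.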