A clustering problem consists of selecting and grouping homogeneous items from an initial data set. To solve this problem, we must:
- Identify a similarity measure between elements
- Determine whether there are subsets of elements that are similar according to the chosen measure
The algorithm determines which elements form a cluster and what degree of similarity unites them within the cluster.
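As a rough illustration of these two steps, the sketch below uses Euclidean distance as the similarity measure (smaller distance means more similar) and groups points that can be reached from one another through neighbors lying within a fixed threshold. The function names `euclidean`, `group_by_threshold`, and `diameter`, the sample points, and the threshold value are all assumptions made for this example; this is only one simple way to form groups, not the method described in the sections that follow.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def group_by_threshold(points, threshold):
    """Group point indices so that points linked by a chain of neighbors,
    each within `threshold` of the next, end up in the same cluster."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = [seed]
        frontier = [seed]
        while frontier:
            i = frontier.pop()
            # Collect every unvisited point close enough to point i.
            near = [j for j in unvisited
                    if euclidean(points[i], points[j]) <= threshold]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters)

def diameter(points, cluster):
    """Largest pairwise distance inside a cluster: a crude indicator of
    how tightly its members are united (0.0 for a single point)."""
    return max(euclidean(points[i], points[j])
               for i in cluster for j in cluster)

# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8)]
clusters = group_by_threshold(points, threshold=1.0)
print(clusters)  # [[0, 1, 2], [3, 4]]
for c in clusters:
    print(c, round(diameter(points, c), 3))
```

Here the output lists which elements form each cluster, and the diameter gives one possible reading of the degree of similarity within it. Note that no labels are supplied: the groups emerge from the distances alone.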
Clustering algorithms are unsupervised methods, because we do not assume any prior information about the structure and characteristics of the clusters.
The k-means algorithm
One of the simplest and most common clustering algorithms is k-means, which partitions a set of objects into k clusters on the basis of their attributes. Each cluster ...