Finding a common center - K-means

Here we go! After the necessary review of the preliminaries, we can finally start learning from data. Our goal here is to label data that we observe in real life.

The K-means method works with the following elements:

  • A set of N-dimensional elements of numeric type
  • A predetermined number of groups (this is tricky because we have to make an educated guess)
  • A representative point for each group (called its centroid)

The main objective of this method is to split the dataset into the chosen number of clusters, each of which can be represented by its centroid.
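To make these elements concrete, here is a minimal sketch of the usual iterative procedure (often called Lloyd's algorithm), written with NumPy. It is an illustration under stated assumptions rather than the implementation used later in the book: the function name kmeans, its parameters, the random choice of initial centroids, and the tiny sample dataset are all made up for this example.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Illustrative K-means sketch: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct samples picked at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance from every point to every centroid: shape (n_points, k).
        diffs = points[:, None, :] - centroids[None, :, :]
        distances = np.linalg.norm(diffs, axis=2)
        # Assign each point to its nearest centroid.
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical sample data: two well-separated groups of 2-D points.
points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Note that the number of groups, k, is passed in by the caller. This is the educated guess mentioned in the list above, and a poor guess will still produce k clusters, just not meaningful ones.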

The word centroid comes from mathematics and has carried over into calculus and physics. Here we find a classical representation ...
