July 2017
Intermediate to advanced
796 pages
18h 55m
English
Suppose we have n data points $x_i$, $i = 1, \dots, n$, that need to be partitioned into k clusters. The goal is to assign a cluster to each data point. K-means aims to find the positions $\mu_i$, $i = 1, \dots, k$, of the cluster centers that minimize the distance from the data points to their cluster. Mathematically, the K-means algorithm tries to achieve this goal by solving the following optimization problem:

$$\underset{c}{\arg\min} \sum_{i=1}^{k} \sum_{x \in c_i} d(x, \mu_i) = \underset{c}{\arg\min} \sum_{i=1}^{k} \sum_{x \in c_i} \|x - \mu_i\|_2^2$$

In the preceding equation, $c_i$ is the set of data points assigned to cluster $i$, and $d(x, \mu_i) = \|x - \mu_i\|_2^2$ is the squared Euclidean distance to be calculated (we will explain why we should use this distance ...
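
To make the objective concrete, the following is a minimal sketch in Python with NumPy, not the book's own implementation. It evaluates the sum of squared Euclidean distances for a given assignment and shows the nearest-center assignment step. The function names `kmeans_objective` and `assign_to_nearest`, and the toy data, are assumptions made purely for illustration.

```python
import numpy as np


def assign_to_nearest(X, centroids):
    """Return, for each row of X, the index of the closest centroid."""
    # Pairwise squared Euclidean distances, shape (n_points, k).
    d2 = np.sum((X[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
    return np.argmin(d2, axis=1)


def kmeans_objective(X, labels, centroids):
    """Sum over all points of the squared distance to the assigned centroid."""
    diffs = X - centroids[labels]
    return float(np.sum(diffs ** 2))


# Hypothetical toy data: four 2-D points forming two obvious groups.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
# Hypothetical cluster centers (the mu_i in the text).
centroids = np.array([[0.0, 1.0], [10.0, 1.0]])

labels = assign_to_nearest(X, centroids)
objective = kmeans_objective(X, labels, centroids)
print(labels, objective)
```

In this toy setup each point lies at distance 1 from its assigned center, so the objective is the sum of four squared distances of 1. Moving either center away from the mean of its two points would increase the value, which is the quantity K-means tries to minimize.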