The K-Means machine learning technique

K-Means is one of the most popular unsupervised machine learning techniques that is used to create clusters, and so categorizes data.

An intuitive example could be posed as follows:

Say a university was offering a new course on American History and Asian History. The university maintains a 15:1 student-teacher ratio, so there is 1 teacher per 15 students. It has conducted a survey which contains a 10-point numeric score that was assigned by each student to their preference of studying American History or Asian History.

We can use the in-built K-Means algorithm in R to create 2 clusters and presumably, by the number of points in each cluster, it may be possible to get an estimate of the number of students ...

Get Practical Big Data Analytics now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.