Clustering using k-means

Cluster analysis, or clustering, is the process of grouping data into multiple groups so that the items in one group are similar to one another and dissimilar to the items in other groups.
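
To make the idea of "similar within a group, dissimilar across groups" concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only and does not use Spark; the function names `nearest` and `kmeans`, the sample points, and the fixed seed are all assumptions made for this example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, max_iterations=100, seed=0):
    """Group points into k clusters; returns (centroids, assignments)."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    centroids = rng.sample(points, k)   # start from k distinct input points
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break                       # nothing moved, so we have converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                 # leave an empty cluster's centroid alone
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    # Recompute once so the assignments match the final centroids.
    return centroids, [nearest(p, centroids) for p in points]

# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

The first three points should end up sharing one cluster label and the last three the other; which label is 0 and which is 1 depends on the random starting centroids.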

The following are a few examples where clustering is used:

  • Market segmentation: Dividing the target market into multiple segments so that the needs of each segment can be served better
  • Social network analysis: Finding coherent groups of people in a social network, for example to target ads on a social networking site such as Facebook
  • Data center computing clusters: Putting a set of computers together to improve performance
  • Astronomical data analysis: Understanding astronomical data and events such as galaxy formations
  • Real estate: Identifying neighborhoods based on similar features ...
