Apache Mahout Clustering Designs by Ashish Gupta

Chapter 2. Understanding K-means Clustering

In the previous chapter, we discussed clustering in general and its different types. In this chapter, we will discuss K-means, one of the most popular clustering algorithms. We will cover the following topics:

  • Learning K-means
  • Using Mahout to execute K-means
  • Visualizing the K-means cluster using Mahout

Learning K-means

Just as you cannot do engineering without math, you cannot start a discussion of clustering without K-means. It is one of the most basic and most useful clustering algorithms.

The algorithm is called K-means because it divides a set of data into K different clusters. The algorithm therefore puts a hard limit on the number of clusters formed: exactly K, chosen before the algorithm runs. ...
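To make the idea of dividing data into K clusters concrete, here is a minimal sketch of the standard K-means iteration in plain Python. It is not Mahout code. The function name `kmeans`, the sample points, and the choice to seed the centroids with the first K points are assumptions made only for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iterations=10):
    """Illustrative K-means: assign points to the nearest centroid, then re-average."""
    # Assumption: seed with the first k points so the sketch is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignments


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (0.5, 1.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

Whatever the data looks like, the result always contains exactly K centroids, which is the hard limit described above. Choosing K well is left to the user.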
