
Apache Mahout Clustering Designs by Ashish Gupta


Summary

This chapter covered K-means clustering: how the algorithm works and how to run the Mahout implementation of K-means on a text dataset. We downloaded the data and converted it into a vector format that Mahout can reuse.
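As a quick recap of how K-means works, here is a minimal sketch in plain Python. It is not Mahout's implementation; the function names (`kmeans`, `nearest`), the fixed iteration cap, and the random seeding of initial centroids are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative K-means: returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so we have converged
        centroids = new_centroids
    # Final assignment against the centroids that are returned.
    assignments = [nearest(p, centroids) for p in points]
    return centroids, assignments


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

On this toy data the two tight groups of points end up in separate clusters. Real text data would first be turned into vectors, as we did in this chapter, before any such distance computation makes sense.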

We also discussed how to inspect the resulting clusters using the clusterdumper utility, and we looked at an example class, taken from the Mahout examples, that visualizes the clusters.

In the next chapter, we will discuss Canopy clustering. This technique is useful in its own right and can also be used to estimate a suitable value of K (the number of clusters) for K-means clustering.
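To give a rough idea of how Canopy clustering can suggest a value of K, the following is a minimal sketch of the generic canopy procedure in plain Python. It is not Mahout's implementation; the function name `canopies` and the threshold parameters `t1` (loose) and `t2` (tight) are assumptions chosen for illustration, and the number of canopies found depends heavily on those thresholds.

```python
import math


def canopies(points, t1, t2):
    """Illustrative canopy pass: returns a list of (center, members)."""
    if t1 <= t2:
        raise ValueError("t1 (loose threshold) must be greater than t2 (tight)")
    remaining = list(points)
    result = []
    while remaining:
        # Take the next unprocessed point as a new canopy center.
        center = remaining.pop(0)
        # Every remaining point within the loose threshold joins this canopy.
        members = [center] + [p for p in remaining
                              if math.dist(center, p) < t1]
        result.append((center, members))
        # Points within the tight threshold cannot start another canopy.
        remaining = [p for p in remaining if math.dist(center, p) >= t2]
    return result


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
found = canopies(points, t1=5.0, t2=3.0)
estimated_k = len(found)  # number of canopies as a rough guess for K
```

The count of canopies can then be passed to K-means as K, and the canopy centers can serve as initial centroids.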
