Apache Mahout Clustering Designs by Ashish Gupta

Summary

In this chapter, we discussed Canopy clustering and how it can be used to obtain the initial number of clusters. We looked at how the Canopy clustering algorithm works and used the Mahout implementation of Canopy on a text dataset to generate canopies. We also saw how Canopy clustering is implemented using MapReduce, and we walked through an example class, taken from the Mahout examples, that visualizes the clusters Mahout produces. Finally, we covered the code that converts a CSV file to the vector format used by Mahout.

In the next chapter, we will move on to the Fuzzy K-means clustering algorithm, another important topic in clustering.
