In this chapter, we discussed Canopy clustering and saw how it can be used to obtain the initial number of clusters. We looked at how the Canopy clustering algorithm works and how it is implemented using MapReduce, and we used the Mahout implementation of Canopy on a text dataset to generate canopies. We also walked through the example class provided with Mahout for visualizing clusters, and the code that converts a CSV file to the vector format used by Mahout.
In the next chapter, we will move on to another clustering algorithm, Fuzzy K-means.