
Apache Mahout Clustering Designs by Ashish Gupta


Summary

In this chapter, we discussed how to improve cluster quality. We looked at measurement techniques that help us assess cluster quality, and at intrinsic and extrinsic methods for cluster evaluation. Then, we saw how to use an inter-cluster distance measure to calculate the Dunn index. We also discussed custom distance measures in Mahout; a poorly chosen distance measure can badly degrade the quality of the clusters. In the next, and final, chapter of this book, we will use Hadoop to run our clustering job and see how to approach clustering in production.
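
As a reminder of what the Dunn index measures, one common formulation is sketched below. This is an assumption: the summary does not restate the exact variant used in the chapter, and the symbols (k clusters C_1 ... C_k, an inter-cluster distance \delta, and a cluster diameter \Delta) are introduced here only for illustration.

```latex
% Dunn index: smallest distance between any two clusters,
% divided by the largest diameter of any single cluster.
\[
  \mathrm{DI} \;=\;
  \frac{\displaystyle \min_{1 \le i < j \le k} \delta(C_i, C_j)}
       {\displaystyle \max_{1 \le m \le k} \Delta(C_m)}
\]
```

Under this formulation, a higher value suggests compact, well-separated clusters. For example, if the closest pair of clusters is 6 units apart and the widest cluster has a diameter of 2 units, the index is 6 / 2 = 3.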
