Apache Mahout Clustering Designs by Ashish Gupta

Chapter 8. Improving Cluster Quality

In the previous chapters, we discussed different clustering techniques and the algorithms available in Mahout. In this chapter, we will focus on how to evaluate whether a clustering algorithm has performed well. First, we have to understand how well the current clusters fit the data, and then we can see where they can be improved. The output of a clustering run depends on the algorithm chosen, the input data, and the parameters passed to the algorithm. The basic idea behind improving cluster quality is therefore to vary these choices, such as the distance measure or the input matrix, and to check the other parameters that are passed to the clustering algorithm. So, while we evaluate the cluster, we basically perform the following tasks:

  • Measuring ...
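To make the point about the distance measure concrete, here is a minimal sketch in plain Python. It is not Mahout's API, and the function names and sample points are hypothetical, chosen only for illustration. It shows that the same point can have a different nearest neighbor under Euclidean distance than under cosine distance, which is why changing the distance measure can change the resulting clusters.

```python
import math


def euclidean_distance(a, b):
    # Straight-line distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    # 1 - cosine similarity; depends on direction, not magnitude.
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(point, candidates, distance):
    # Return the candidate closest to `point` under the given measure.
    return min(candidates, key=lambda c: distance(point, c))


# Hypothetical sample data: one query point and two candidate centers.
point = (1.0, 1.0)
candidates = [(10.0, 10.0), (2.0, 0.0)]

closest_euclidean = nearest(point, candidates, euclidean_distance)
closest_cosine = nearest(point, candidates, cosine_distance)
```

Under Euclidean distance the point is closest to `(2.0, 0.0)`, while under cosine distance it is closest to `(10.0, 10.0)`, because that candidate points in the same direction. A point assigned to one cluster under the first measure could therefore be assigned to another under the second.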