Apache Mahout Clustering Designs by Ashish Gupta

Visualizing clusters

The Mahout example package provides classes to generate a sample dataset.

For K-means, DisplayKMeans is the class that displays the clusters. You can run the class directly. In its code, the points are generated as follows:

generateSamples(500, 1, 1, 3);   // 500 samples around mean (1, 1), sd 3
generateSamples(300, 1, 0, 0.5); // 300 samples around mean (1, 0), sd 0.5
generateSamples(300, 0, 2, 0.1); // 300 samples around mean (0, 2), sd 0.1

The data is a set of randomly generated 2D points. Each call draws its points from a normal distribution centered at the given mean location, with a constant standard deviation.
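
To make the sampling step concrete, here is a minimal standalone sketch of how such points could be drawn. It is not Mahout's own code: the class name SampleGenerator, the explicit Random argument, and the use of plain double[] pairs instead of Mahout vectors are assumptions made for illustration. Only the argument order (count, mean x, mean y, standard deviation) follows the calls shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Mahout.
final class SampleGenerator {

    // Draws num 2D points whose x and y coordinates are independent
    // normal variables with means mx and my and standard deviation sd.
    static List<double[]> generateSamples(int num, double mx, double my,
                                          double sd, Random rng) {
        List<double[]> points = new ArrayList<>(num);
        for (int i = 0; i < num; i++) {
            // nextGaussian() returns a value with mean 0 and sd 1,
            // so scale by sd and shift by the mean.
            double x = mx + sd * rng.nextGaussian();
            double y = my + sd * rng.nextGaussian();
            points.add(new double[] {x, y});
        }
        return points;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so runs are repeatable
        List<double[]> data = new ArrayList<>();
        // Same three groups as in the calls shown above.
        data.addAll(generateSamples(500, 1, 1, 3, rng));
        data.addAll(generateSamples(300, 1, 0, 0.5, rng));
        data.addAll(generateSamples(300, 0, 2, 0.1, rng));
        System.out.println(data.size() + " points generated");
    }
}
```

The three groups differ widely in spread: the first is broad and overlaps the other two, while the last is packed tightly around its mean. This makes the dataset a reasonable visual test of how well a clustering algorithm separates groups of different density.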

Once you run this class, it displays the clusters, as shown here:

The final clustering produced by the algorithm is drawn as a bold red circle. In the ...
