The T Mahout example package provides classes to generate a sample dataset.
DisplayKmeans is the class that displays the cluster. You can directly run the class. As per the code in the class, points are generated as follows:
generateSamples(500, 1, 1, 3); // 500 samples of sd 3 generateSamples(300, 1, 0, 0.5); //300 sample of sd 0.5 generateSamples(300, 0, 2, 0.1); //300 sample of sd 0.1
Data is a set of randomly-generated 2D data points, and the points are generated using a normal distribution centered at a mean location with a constant standard deviation.
Once you run this class, you will view the clusters, as shown here:
The final clustering done by the algorithm is shown using a bold red colored circle. In the ...