
Apache Mahout Clustering Designs by Ashish Gupta

Mahout implementation of spectral clustering

The Mahout implementation of spectral clustering requires an affinity matrix as input from the user and uses the K-means algorithm for the final clustering. Spectral clustering with Mahout usually consists of the following steps:

  1. The user starts with a k*n data matrix (k points, each with n dimensions) that is to be clustered.
  2. The user creates a similarity matrix from the original data matrix. This is a k*k transformation of the original matrix that captures how the points are related to each other.
  3. From the similarity matrix, an affinity matrix needs to be created. Mahout takes the affinity matrix, a Hadoop-backed matrix type, as input in the form of a text file. The matrix represents a weighted, undirected graph. Each line of the text file represents ...
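
As a rough illustration of step 2, the sketch below builds a k*k similarity matrix from a k*n data matrix in plain Python. It assumes a Gaussian (RBF) kernel on squared Euclidean distance; the kernel choice, the `sigma` parameter, and the function name `gaussian_similarity` are assumptions made for illustration only. The text does not prescribe a similarity measure, and this is not Mahout code.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Return a k x k similarity matrix for k points of n dimensions each.

    Assumption: similarity is exp(-||x_i - x_j||^2 / (2 * sigma^2)).
    Other similarity measures would work equally well for this step.
    """
    k = len(points)
    sim = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            # Squared Euclidean distance between point i and point j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            value = math.exp(-sq_dist / (2.0 * sigma ** 2))
            # The matrix is symmetric, matching an undirected graph.
            sim[i][j] = value
            sim[j][i] = value
    return sim


# Hypothetical data: k = 3 points, n = 2 dimensions.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
similarity = gaussian_similarity(points)
```

Nearby points receive values close to 1 and distant points values close to 0, so the symmetric result can be read as edge weights of an undirected graph. Writing these entries out in the text-file form that Mahout expects is outside the scope of this sketch.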
