
Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Clustering using GMM

We will cluster both users and items (movies, in this case) to get a better understanding of how the algorithm groups each of them.

Perform the following steps:

  1. Load the libsvm file for users.
  2. Create a Gaussian Mixture instance. The instance has the following parameters which can be configured:
       final val featuresCol: Param[String]
           Param for features column name.
       final val k: IntParam
           Number of independent Gaussians in the mixture model.
       final val maxIter: IntParam
           Param for maximum number of iterations (>= 0).
       final val predictionCol: Param[String]
           Param for prediction column name.
       final val probabilityCol: Param[String]
           Param for Column name for predicted class conditional probabilities.
       ...
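The first two steps can be sketched in Scala as follows. This is a minimal illustration rather than the book's own listing: the object name GmmUsersSketch, the helper buildGmm, the default path data/users.libsvm, and the values k = 5, maxIter = 20, and seed = 1 are all assumptions chosen for the example. The setters correspond to the parameters listed above.

```scala
import org.apache.spark.ml.clustering.GaussianMixture
import org.apache.spark.sql.SparkSession

object GmmUsersSketch {

  // Hypothetical helper: builds a GaussianMixture estimator with the
  // parameters from the list above set explicitly.
  def buildGmm(k: Int, maxIter: Int): GaussianMixture =
    new GaussianMixture()
      .setK(k)                           // number of Gaussians in the mixture
      .setMaxIter(maxIter)               // upper bound on EM iterations
      .setFeaturesCol("features")        // column produced by the libsvm reader
      .setPredictionCol("prediction")    // most likely cluster per row
      .setProbabilityCol("probability")  // per-cluster membership probabilities
      .setSeed(1L)                       // fixed seed, only for repeatable runs

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("GmmUsersSketch")
      .master("local[*]")
      .getOrCreate()

    // Assumed location of the users file; pass the real path as the first argument.
    val usersPath = if (args.nonEmpty) args(0) else "data/users.libsvm"

    // Step 1: load the libsvm file (yields "label" and "features" columns).
    val users = spark.read.format("libsvm").load(usersPath)

    // Step 2: create the Gaussian Mixture instance, then fit it to the users.
    val model = buildGmm(5, 20).fit(users)

    // Attach a cluster assignment and membership probabilities to each user.
    model.transform(users).select("prediction", "probability").show(5, false)

    // Print the mixing weight and mean vector of each fitted Gaussian.
    model.weights.zip(model.gaussians).zipWithIndex.foreach {
      case ((weight, gaussian), i) =>
        println(s"cluster $i: weight=$weight, mean=${gaussian.mean}")
    }

    spark.stop()
  }
}
```

The same estimator could be fitted to the items (movies) file by passing a different path, since nothing in buildGmm depends on the data being clustered.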
