O'Reilly logo

Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Training a model on the MovieLens 100k dataset

We're now ready to train our model! The other inputs required for our model are as follows:

  • rank: This refers to the number of factors in our ALS model, that is, the number of hidden features in our low-rank approximation matrices. Generally, the greater the number of factors, the better, but this has a direct impact on memory usage, both for computation and to store models for serving, particularly for large numbers of users or items. Hence, this is often a trade-off in real-world use cases. It also impacts the amount of training data required.
  • A rank in the range of 10 to 200 is usually reasonable.
  • iterations: This refers to the number of iterations to run. While each iteration in ALS is guaranteed ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required