O'Reilly logo

Fast Data Processing with Spark 2 - Third Edition by Krishna Sankar

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Hyper parameters

We have glossed over an important aspect: model tuning. As you can see, there are many parameters that can be tuned, depending on the algorithm. And we have been setting the parameters once. For example, in the case of the recommender, we set rank=12, regularizationParameter=0.1, and maxIterations=20. In reality, the rank could be 8 or 12; the regularization parameter 0.1,1.0, or 10; and the iterations 10 or 20. So now we need to try 12 runs with these different values, calculate the accuracy, and then select the one with the best value. This is a simple case; we might have more than 100 runs and many parameters. This is where cross validation comes into the picture. To keep this book within its boundaries, I will leave this part ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required