
Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua


The naive Bayes model

Finally, let's see the impact of changing the lambda parameter for naive Bayes. This parameter controls additive smoothing, which prevents a zero probability estimate when a class and a feature value never occur together in the training data.

See http://en.wikipedia.org/wiki/Additive_smoothing for more details on additive smoothing.
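To make the effect of lambda concrete, here is a minimal sketch of additive smoothing for a single class, written in plain Scala with no Spark dependency. The object AdditiveSmoothingSketch and the helper smoothedProbs are hypothetical names introduced only for illustration; they are not part of the MLlib API, and the counts are made-up toy values.

```scala
// Illustrative sketch only: hypothetical helper, not part of MLlib.
object AdditiveSmoothingSketch {
  // counts(i) is how often feature i was observed together with one class.
  // Each estimate is (count + lambda) / (total + lambda * numFeatures).
  def smoothedProbs(counts: Seq[Double], lambda: Double): Seq[Double] = {
    val total = counts.sum
    val numFeatures = counts.length
    counts.map(c => (c + lambda) / (total + lambda * numFeatures))
  }

  def main(args: Array[String]): Unit = {
    // Toy counts (assumed): the second feature never occurs with this class.
    val counts = Seq(3.0, 0.0, 1.0)
    println(smoothedProbs(counts, 0.0)) // unsmoothed: the unseen feature gets probability 0
    println(smoothedProbs(counts, 1.0)) // lambda = 1: the unseen feature gets 1/7
  }
}
```

With lambda set to 0, the unseen feature gets probability zero, which would zero out the whole class likelihood for any example containing it. Larger values of lambda pull all estimates toward a uniform distribution.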

We will take the same approach as before: first create a convenience training function, then train the model with several values of lambda, as follows:

def trainNBWithParams(input: RDD[LabeledPoint], lambda: Double) = {
  val nb = new NaiveBayes
  nb.setLambda(lambda)
  nb.run(input)
}
val nbResults = Seq(0.001, 0.01, 0.1, 1.0, 10.0).map { param =>
  val model = trainNBWithParams(dataNB, param)
  val scoreAndLabels ...
