Practical Predictive Analytics by Ralph Winters

Validating the regression results

Logistic regression in SparkR lacks cross-validation and some of the other features that you may be used to in base R. It is nevertheless a reasonable starting point for running large-scale models. If you need the cross-validation techniques that have already been covered, you can extract a sample of the data, bring it into local memory (via collect), and run the regression in base R.

There are also techniques for producing pseudo R-squared values and other diagnostics without leaving Spark, which we will demonstrate.
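
As a point of reference, one widely used pseudo R-squared for logistic regression is McFadden's. The text does not say which variant it goes on to compute, so take the formula below as an illustrative assumption rather than the author's method. Here L_M is the likelihood of the fitted model and L_0 is the likelihood of the intercept-only (null) model; for ungrouped binary outcomes the same quantity can be written with the residual deviance D_M and the null deviance D_0:

```latex
\[
R^2_{\text{McFadden}}
  \;=\; 1 - \frac{\ln L_M}{\ln L_0}
  \;=\; 1 - \frac{D_M}{D_0},
\qquad D = -2 \ln L .
\]
% Hypothetical numbers, for illustration only:
% D_0 = 1000 and D_M = 750 give R^2 = 1 - 750/1000 = 0.25.
```

Values closer to 1 indicate that the fitted model improves substantially on the null model. Values near 0 indicate little improvement. Both deviances are scalar summaries, so only two numbers need to be brought back from the cluster to compute the ratio.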
