Using Spark for prediction

In this part of the chapter, we use the Spark sample code to create a logistic regression model, save it, and evaluate its performance on a test dataset. The features and class label are specified with an RFormula transformer, and the model is trained through a pipeline that combines that formula with a logistic regression estimator. The estimator is defined as follows:

logReg = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)
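To make the two regularization settings above more concrete, here is a small sketch in plain Python (no Spark required) of the elastic-net penalty as Spark ML documents it: regParam scales the overall penalty, and elasticNetParam mixes the L1 and L2 terms. The helper name elastic_net_penalty and the weight values are hypothetical and are not part of the chapter's sample code.

```python
def elastic_net_penalty(weights, reg_param, elastic_net_param):
    """Elastic-net penalty: reg_param * (alpha * L1 + (1 - alpha) / 2 * L2^2)."""
    l1 = sum(abs(w) for w in weights)       # L1 norm of the weights
    l2_squared = sum(w * w for w in weights)  # squared L2 norm of the weights
    return reg_param * (
        elastic_net_param * l1 + (1.0 - elastic_net_param) / 2.0 * l2_squared
    )


# Hypothetical coefficient vector, chosen only for illustration.
weights = [0.5, -1.0, 0.0]

# Same settings as the estimator above: regParam=0.3, elasticNetParam=0.8.
penalty = elastic_net_penalty(weights, reg_param=0.3, elastic_net_param=0.8)
print(penalty)
```

With elasticNetParam=0.8 the penalty is weighted mostly toward the L1 term, which tends to push some coefficients to exactly zero; a value of 0 would give a pure L2 (ridge) penalty and a value of 1 a pure L1 (lasso) penalty.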

The training formula is then set up and assigned to the classFormula variable:

classFormula = RFormula(formula="tipped ~ pickup_hour + weekday + ...
