Practical Predictive Analytics by Ralph Winters

Predicting outcomes

SparkR provides a prediction method, aptly named predict(), so we can generate predictions for the training set:

  • After running the following code, you will observe that the resulting object contains a new column named prediction
  • We also add a flag column (grp), set to 1, to indicate that the output comes from the training data
  • We also append the total number of rows to each record, since we will need it later for calculations:
        # look at the predictions vs. the training dataset
        preds_train <- predict(model, df)
        preds_train$grp <- 1
        preds_train$totrows <- nrow(preds_train)

The prediction column holds the predicted probability that the event (diabetes) will occur:

head(preds_train) 
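
To try the same steps without a Spark session, the following is a minimal base-R sketch rather than the SparkR code itself. It assumes a synthetic data frame with hypothetical columns (glucose, outcome) and uses glm() with predict(type = "response") as a local stand-in for the SparkR model and its predict() method. The names df_local, model_local, and preds_local are illustrative only.

```r
# Base-R stand-in for the SparkR steps.
# Assumption: synthetic data and hypothetical column names, not the book's dataset.
set.seed(42)
n <- 200
df_local <- data.frame(glucose = rnorm(n, mean = 120, sd = 30))

# Simulate a binary outcome whose odds rise with glucose
p_true <- plogis((df_local$glucose - 120) / 15)
df_local$outcome <- rbinom(n, size = 1, prob = p_true)

# Fit a logistic regression (a binomial GLM) on the local data frame
model_local <- glm(outcome ~ glucose, data = df_local, family = binomial())

# Score the training data; type = "response" returns probabilities, not log-odds
preds_local <- df_local
preds_local$prediction <- predict(model_local, newdata = df_local, type = "response")
preds_local$grp <- 1                      # flag: rows come from the training data
preds_local$totrows <- nrow(preds_local)  # row count repeated on every record

head(preds_local)
```

As in the SparkR version, every row carries the same grp and totrows values, which keeps later per-group calculations simple once results from another dataset are appended.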
