Practical Predictive Analytics by Ralph Winters

Combining the training and test dataset

Next, we will combine the training (grp=1) and testing (grp=0) datasets into one dataframe and manually calculate some accuracy statistics:

  • preds$error: this is the absolute difference between the outcome (0,1) and the prediction. Recall that for a binary regression model, the prediction represents the probability that the event (diabetes) will occur.
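  • preds$errorsqr: this is the squared error. Squaring a raw difference removes its sign; since preds$error is already an absolute value, the practical effect here is to weight large errors more heavily than small ones.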
  • preds$correct: to classify each prediction as correct or not correct, we will compare the error to a .5 cutoff. If the error is small (<= .5) we will call it correct; otherwise it will be considered not correct. This is a somewhat arbitrary cutoff, ...
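
The three columns described above can be sketched in a few lines of R. This is a minimal illustration rather than the book's own code: the two small data frames and the column names outcome and prediction are hypothetical stand-ins for the real training and test predictions, and the cutoff is assumed to be inclusive (<= .5).

```r
# Hypothetical stand-ins for the scored training and test sets.
# "outcome" is the observed 0/1 value; "prediction" is the predicted probability.
train <- data.frame(outcome    = c(1, 0, 1, 0),
                    prediction = c(0.80, 0.30, 0.40, 0.10))
test  <- data.frame(outcome    = c(0, 1, 1),
                    prediction = c(0.60, 0.90, 0.55))

# Tag each row with its source, then stack both sets into one dataframe.
train$grp <- 1   # training
test$grp  <- 0   # testing
preds <- rbind(train, test)

# Absolute difference between the outcome and the predicted probability.
preds$error <- abs(preds$outcome - preds$prediction)

# Squared error.
preds$errorsqr <- preds$error^2

# Assumed inclusive cutoff: an error of at most .5 counts as correct (1).
preds$correct <- ifelse(preds$error <= 0.5, 1, 0)

# Mean error, mean squared error, and share correct, by group.
aggregate(cbind(error, errorsqr, correct) ~ grp, data = preds, FUN = mean)
```

Keeping the grp flag on the combined dataframe makes it possible to compare the same statistics for the training and test rows side by side, which is one simple way to spot a model that fits the training data much better than the test data.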