Practical Predictive Analytics

June 2017

Beginner to intermediate

576 pages

15h 22m

English

Combining the training and test dataset

Next, we will combine the training (grp=1) and testing (grp=0) datasets into one dataframe and manually calculate some accuracy statistics:

preds$error: this is the absolute difference between the outcome (0,1) and the prediction. Recall that for a binary regression model, the prediction represents the probability that the event (diabetes) will occur.
preds$errorsqr: this is the calculated squared error. This is done in order to remove the sign.
preds$correct: to classify each prediction as correct or not correct, we compare the error to a .5 cutoff. If the error is small (<= .5) we call the prediction correct; otherwise it is considered not correct. This is a somewhat arbitrary cutoff, ...
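
The three columns described above can be sketched in a few lines of base R. This is a minimal illustration and not the book's own listing: the `train` and `test` data frames, their values, and the column names `outcome` and `prob` are assumptions made up for the example; only `preds`, `grp`, `error`, `errorsqr`, and `correct` come from the text.

```r
# Hypothetical stand-ins for the scored training and test sets.
# outcome = observed class (0/1), prob = predicted probability of the event.
train <- data.frame(outcome = c(1, 0, 1, 0), prob = c(0.9, 0.2, 0.4, 0.7), grp = 1)
test  <- data.frame(outcome = c(0, 1, 1),    prob = c(0.1, 0.8, 0.3),      grp = 0)

# Stack both groups into a single data frame
preds <- rbind(train, test)

# Absolute difference between the outcome and the predicted probability
preds$error <- abs(preds$outcome - preds$prob)

# Squared error
preds$errorsqr <- preds$error^2

# 1 if the error is within the .5 cutoff, otherwise 0
preds$correct <- ifelse(preds$error <= 0.5, 1, 0)

# Share of correct predictions in each group (names are "0" and "1")
acc <- tapply(preds$correct, preds$grp, mean)
print(acc)
```

Keeping `grp` in the combined data frame is what makes the by-group comparison possible: a large gap between the training and test accuracy would suggest the model does not generalize well.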
