February 2018
Intermediate to advanced
378 pages
10h 14m
English
The score function calculates the accuracy of the model, that is, the fraction of samples it classifies correctly, on the data passed to it. Let's calculate the accuracy of our model on the training set:
In []: tree_model.score(X_train, y_train)
Out[]: 1.0
Wow, it looks like our model is 100% accurate. Isn't that a great result? Let's not hurry, though, and first check our model on held-out data. Evaluation on the test set is the gold standard of success in machine learning:
In []: tree_model.score(X_test, y_test)
Out[]: 0.87666666666666671
The score is worse now. What just happened? This is our first encounter with the problem of overfitting, in which a model tries to fit every quirk in the data. Our model adjusted itself to the training data so closely that, on previously unseen data, it lacks the ability ...
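To see why a perfect training score can be misleading, here is a minimal sketch in plain Python. It is not the decision tree used above: `MemorizingClassifier`, `accuracy`, and the tiny dataset are hypothetical stand-ins invented for illustration. The toy model simply memorizes every training point, including two deliberately mislabeled ones, and predicts the label of the closest memorized point.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


class MemorizingClassifier:
    """Hypothetical toy model: stores the training set and copies
    the label of the nearest stored point (1-nearest neighbour)."""

    def fit(self, X, y):
        self.X_ = list(X)
        self.y_ = list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Index of the memorized point closest to x.
            nearest = min(range(len(self.X_)), key=lambda i: abs(self.X_[i] - x))
            preds.append(self.y_[nearest])
        return preds

    def score(self, X, y):
        # Same idea as the score call above: accuracy on the given data.
        return accuracy(y, self.predict(X))


# Assumed underlying rule: the label is 1 when x >= 10, otherwise 0.
X_train = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# Two "quirks": x=4 and x=14 are mislabeled on purpose.
y_train = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Held-out points the model has never seen, labeled by the clean rule.
X_test = [x + 0.5 for x in X_train]
y_test = [1 if x >= 10 else 0 for x in X_test]

model = MemorizingClassifier().fit(X_train, y_train)
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)

print(train_acc)  # 1.0
print(test_acc)   # 0.8
```

The model reproduces its training labels perfectly, noise included, so the training score is 1.0. On the held-out points it repeats the two memorized mistakes, and the score drops to 0.8. The gap between the two numbers, rather than either number alone, is what signals overfitting.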