December 2018
Finally, we evaluate the best model's performance on the holdout set that we excluded from the GridSearchCV exercise. It contains the last six months of the sample period (through February 2018; see the notebook for details). The following code yields a generalization performance estimate of 0.6622, measured by the AUC score:
best_model = gridsearch_result.best_estimator_
preds = best_model.predict(test_feature_data)
roc_auc_score(y_true=test_target, y_score=preds)
0.6622
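The holdout evaluation above can be tried end to end with a minimal, self-contained sketch. It does not use the book's data: the synthetic dataset from `make_classification`, the small `param_grid`, and the 450/150 split are all assumptions made for illustration, so the scores will not match 0.6622. The sketch also computes the AUC from predicted probabilities alongside the AUC from hard class labels, because `roc_auc_score` ranks observations by score and class labels carry only two distinct values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the book's features and target (illustrative assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=42
)

# Hold out the last 150 rows, loosely mimicking a final out-of-sample period.
X_train, X_test = X[:450], X[450:]
y_train, y_test = y[:450], y[450:]

# A deliberately tiny, hypothetical grid so the search finishes quickly.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
gridsearch_result = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",
    cv=3,
)
gridsearch_result.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is refit on all training data.
best_model = gridsearch_result.best_estimator_

# Hard class labels (0 or 1), as in the snippet above.
label_preds = best_model.predict(X_test)
# Predicted probability of the positive class, a continuous score.
proba_preds = best_model.predict_proba(X_test)[:, 1]

auc_from_labels = roc_auc_score(y_true=y_test, y_score=label_preds)
auc_from_proba = roc_auc_score(y_true=y_test, y_score=proba_preds)

print(f"Best parameters: {gridsearch_result.best_params_}")
print(f"Holdout AUC from class labels:  {auc_from_labels:.4f}")
print(f"Holdout AUC from probabilities: {auc_from_proba:.4f}")
```

The two AUC values generally differ. Which one to report depends on whether the model's ranking of observations or its thresholded decisions is of interest.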
The downside of the sklearn gradient boosting implementation is its limited computational speed, which makes it difficult to try out different hyperparameter settings quickly. In the next section, we will see that several optimized ...