We have been using recall as our proxy for how effective the predictive model is. Recall is still the metric we want to calculate, but bear in mind that the under-sampled data isn't skewed toward a particular class, which makes the recall metric less critical.
We use this parameter to build the final model with the whole training dataset and predict the classes in the test data:
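# Build the final model on the whole under-sampled training dataset
# and predict the classes in the test dataset.
# Note: recent scikit-learn versions need a solver that supports the
# L1 penalty, so solver='liblinear' is set explicitly here.
lr = LogisticRegression(C=best_c, penalty='l1', solver='liblinear')
lr.fit(X_train_undersample, y_train_undersample.values.ravel())
y_pred_undersample = lr.predict(X_test_undersample.values)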
Here is the computed confusion matrix:
cnf_matrix = confusion_matrix(y_test_undersample,y_pred_undersample) ...
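To make the link between the confusion matrix and recall concrete, here is a minimal sketch in plain Python. The toy labels and the helper names confusion_counts and recall_from_matrix are hypothetical and only serve as an illustration; they are not part of the original notebook. The matrix uses the same layout as scikit-learn's confusion_matrix: rows are actual classes, columns are predicted classes.

```python
def confusion_counts(y_true, y_pred):
    """Return a 2x2 matrix [[TN, FP], [FN, TP]] for binary labels 0/1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        # Row index = actual class, column index = predicted class.
        matrix[actual][predicted] += 1
    return matrix


def recall_from_matrix(matrix):
    """Recall = TP / (TP + FN), read from the positive-class row."""
    fn, tp = matrix[1]
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy labels (1 = fraud, 0 = normal), for illustration only.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]

toy_matrix = confusion_counts(y_true, y_pred)
toy_recall = recall_from_matrix(toy_matrix)
print(toy_matrix)   # [[3, 1], [1, 3]]
print(toy_recall)   # 0.75
```

Assuming cnf_matrix has this same layout, the recall on the under-sampled test set would be cnf_matrix[1,1] / (cnf_matrix[1,0] + cnf_matrix[1,1]).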