Predictions on test sets and plotting a confusion matrix

We have been using the recall metric as our proxy for how effective our predictive model is. Recall is still the metric we want to calculate, but bear in mind that the under-sampled data is no longer skewed toward one class, so the recall metric is not as critical here as it is on the imbalanced data.
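As a quick refresher, recall is TP / (TP + FN): the fraction of actual positives the model catches. A toy sketch of computing it both by hand and with scikit-learn (the labels here are illustrative, not from the book's dataset):

```python
import numpy as np
from sklearn.metrics import recall_score, confusion_matrix

# Toy labels to illustrate recall (illustrative data, not the book's dataset)
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])

# confusion_matrix returns [[TN, FP], [FN, TP]] for binary labels
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
recall_manual = tp / (tp + fn)            # TP / (TP + FN)
recall_sklearn = recall_score(y_true, y_pred)
print(recall_manual, recall_sklearn)      # both 0.75
```

On a balanced (under-sampled) set, a high recall is harder to achieve "for free" by simply predicting the majority class, which is why the metric loses some of its urgency here.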

We use the best C parameter found during cross-validation to build the final model on the whole under-sampled training dataset and predict the classes in the test data:

from sklearn.linear_model import LogisticRegression

# Fit the final model on the under-sampled training dataset
lr = LogisticRegression(C=best_c, penalty='l1')
lr.fit(X_train_undersample, y_train_undersample.values.ravel())
y_pred_undersample = lr.predict(X_test_undersample.values)

Next, we compute the confusion matrix:

from sklearn.metrics import confusion_matrix

cnf_matrix = confusion_matrix(y_test_undersample, y_pred_undersample) ...
