November 2019
Intermediate to advanced
346 pages
9h 36m
English
In the following steps, we will demonstrate several methods for dealing with imbalanced data:
from sklearn import tree
from sklearn.metrics import balanced_accuracy_score
import numpy as np
import scipy.sparse
import collections

X_train = scipy.sparse.load_npz("training_data.npz")
y_train = np.load("training_labels.npy")
X_test = scipy.sparse.load_npz("test_data.npz")
y_test = np.load("test_labels.npy")
dt = tree.DecisionTreeClassifier()
dt.fit(X_train, y_train)
dt_pred = dt.predict(X_test)
print(collections.Counter(dt_pred))
print(balanced_accuracy_score(y_test, ...
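To see why the recipe reports balanced accuracy rather than plain accuracy, the following minimal sketch compares the two on made-up labels. It assumes the usual definition of balanced accuracy as the mean of per-class recall. The `balanced_accuracy` helper and the 95/5 class split are hypothetical, written here only for illustration. They are not part of the recipe's data and are not the scikit-learn implementation.

```python
import collections


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (hypothetical helper, not the scikit-learn function)."""
    # Number of samples that truly belong to each class.
    totals = collections.Counter(y_true)
    # Number of correctly predicted samples per class.
    hits = collections.Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Recall for each class; a class with no correct predictions gets 0.
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Made-up labels: 95 samples of the majority class (0), 5 of the minority class (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(collections.Counter(y_pred))         # all predictions fall in class 0
print(plain_accuracy)                      # 0.95
print(balanced_accuracy(y_true, y_pred))   # 0.5
```

Plain accuracy rewards the classifier for ignoring the minority class entirely, while balanced accuracy falls to 0.5, the score of a constant prediction on two classes. This is the reason to look at the `Counter` of predictions alongside the score.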