- Begin by reading in the pickled data:
import pickle
file = open('CTU13Scenario1flowData.pickle', 'rb')
botnet_dataset = pickle.load(file)
- The data is already split into training and test sets, so you only need to assign each part to its own variable:
X_train, y_train, X_test, y_test = (
    botnet_dataset[0],
    botnet_dataset[1],
    botnet_dataset[2],
    botnet_dataset[3],
)
- Instantiate a decision tree classifier with default parameters:
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier()
- Fit the classifier to the training data:
clf.fit(X_train, y_train)
- Test it on the test set:
clf.score(X_test, y_test)
The following is the output:
0.9991001799640072
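The steps above can be tried end to end without the CTU-13 file. The following is a minimal sketch that assumes the pickle holds a four-item sequence in the order `X_train, y_train, X_test, y_test`. The synthetic data from `make_classification`, the name `demo_dataset`, and the fixed `random_state` values are illustrative assumptions, not part of the original recipe, so the printed accuracy will differ from the figure above.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the flow data (assumption: the real
# features are numeric and the labels are binary).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Round-trip the four parts through pickle in memory to mimic the
# layout the recipe unpacks (hypothetical stand-in for the real file).
payload = pickle.dumps((X_tr, y_tr, X_te, y_te))
demo_dataset = pickle.loads(payload)
X_train, y_train, X_test, y_test = demo_dataset

# Same steps as the recipe; random_state is fixed only for repeatability.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# For classifiers, score() returns the mean accuracy on the given data,
# so it matches accuracy_score computed from explicit predictions.
score = clf.score(X_test, y_test)
manual = accuracy_score(y_test, clf.predict(X_test))
print(f"accuracy: {score:.4f}")
```

Because `score` reports plain accuracy, a high value on a heavily imbalanced dataset is worth checking against per-class metrics before drawing conclusions.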