How to do it…

  1. Begin by reading in the pickled data:
import pickle

with open('CTU13Scenario1flowData.pickle', 'rb') as file:
    botnet_dataset = pickle.load(file)
  2. The data is already split into training and test sets, so you only need to assign these to their respective variables:
X_train, y_train, X_test, y_test = (
    botnet_dataset[0],
    botnet_dataset[1],
    botnet_dataset[2],
    botnet_dataset[3],
)
  3. Instantiate a decision tree classifier with default parameters:
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier()
  4. Fit the classifier to the training data:
clf.fit(X_train, y_train)
  5. Test it on the test set:
clf.score(X_test, y_test)

The following is the output:

0.9991001799640072
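
For readers without the pickled file at hand, the steps above can be tried end to end with the following minimal sketch. It rests on several assumptions that are not part of the recipe: the data is synthetic (generated with `make_classification`, not CTU-13 flows), the file name `demo_flow_data.pickle` is hypothetical, and `random_state=0` is set only so that repeated runs agree. The sketch also checks that, for a scikit-learn classifier, `score` reports the mean accuracy on the data passed to it.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: synthetic, imbalanced two-class samples,
# NOT the CTU-13 flows used in the recipe.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Store the splits in the same order the recipe unpacks them:
# (X_train, y_train, X_test, y_test).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_flow_data.pickle")  # hypothetical name
    with open(path, "wb") as f:
        pickle.dump((X_train, y_train, X_test, y_test), f)
    with open(path, "rb") as f:
        demo_dataset = pickle.load(f)

# Unpack exactly as in step 2 of the recipe.
X_train, y_train, X_test, y_test = demo_dataset

# random_state is an addition for repeatability; the recipe uses defaults.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# score() on a classifier is the mean accuracy over the given samples,
# so it should match accuracy_score computed from the predictions.
score = clf.score(X_test, y_test)
manual_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(score, manual_accuracy)
```

The number printed here will differ from the output shown above, because the sketch uses different data. Only the workflow carries over.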
