How to do it...

The following steps demonstrate how to apply the Isolation Forest algorithm to detect anomalies:

  1. Import the required libraries and set a random seed:
import numpy as np
import pandas as pd
random_seed = np.random.RandomState(12)
  2. Generate a set of normal observations, to be used as training data:
X_train = 0.5 * random_seed.randn(500, 2)
X_train = np.r_[X_train + 3, X_train]
X_train = pd.DataFrame(X_train, columns=["x", "y"])
  3. Generate a testing set, also consisting of normal observations:
X_test = 0.5 * random_seed.randn(500, 2)
X_test = np.r_[X_test + 3, X_test]
X_test = pd.DataFrame(X_test, columns=["x", "y"])
  4. Generate a set of outlier observations. These are generated from a different distribution than the normal ...
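
The import and data-generation steps for the training and testing sets can be collected into one runnable script, which makes it easier to see what np.r_ does here: it stacks a copy of the sampled points shifted by 3 on top of the unshifted points, so each set holds two clusters of 500 points each, centered near (3, 3) and (0, 0). This is a minimal sketch rather than the book's own listing. The helper name make_two_cluster_frame and its n_per_cluster parameter are hypothetical, introduced only to avoid repeating the same three lines, and the final print calls are added for inspection. The outlier set and the model fitting are not covered because the source is cut off before them.

```python
import numpy as np
import pandas as pd

# Despite its name in the recipe, random_seed is a RandomState generator
# seeded with 12, not an integer seed.
random_seed = np.random.RandomState(12)


def make_two_cluster_frame(rng, n_per_cluster=500):
    """Hypothetical helper (not from the book): build one data set of normal points."""
    # Sample 2-D Gaussian points with standard deviation 0.5 around the origin.
    base = 0.5 * rng.randn(n_per_cluster, 2)
    # np.r_ stacks row-wise: the shifted copy (near (3, 3)) first, then the originals.
    stacked = np.r_[base + 3, base]
    return pd.DataFrame(stacked, columns=["x", "y"])


# Same draw order as the recipe: training set first, then testing set.
X_train = make_two_cluster_frame(random_seed)
X_test = make_two_cluster_frame(random_seed)

print(X_train.shape)  # (1000, 2): two clusters of 500 rows each
print(X_train.iloc[:500].mean().to_dict())  # per-column mean of the shifted cluster
print(X_train.iloc[500:].mean().to_dict())  # per-column mean of the unshifted cluster
```

Because both sets draw from the same generator in sequence, X_train and X_test contain different points from the same distribution, which is what lets the testing set stand in for unseen normal observations.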
