How to do it...

In the next steps, we demonstrate how to apply the Isolation Forest algorithm to detecting anomalies:

  1. Import the required libraries and set a random seed:
import numpy as npimport pandas as pdrandom_seed = np.random.RandomState(12)
  1. Generate a set of normal observations, to be used as training data:
X_train = 0.5 * random_seed.randn(500, 2)X_train = np.r_[X_train + 3, X_train]X_train = pd.DataFrame(X_train, columns=["x", "y"])
  1. Generate a testing set, also consisting of normal observations:
X_test = 0.5 * random_seed.randn(500, 2)X_test = np.r_[X_test + 3, X_test]X_test = pd.DataFrame(X_test, columns=["x", "y"])
  1. Generate a set of outlier observations. These are generated from a different distribution than the normal ...

Get Machine Learning for Cybersecurity Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.