Hands-On Unsupervised Learning with Python

by Giuseppe Bonaccorso
February 2019
Content level: Intermediate to advanced
386 pages
9h 54m
English
Packt Publishing
Content preview from Hands-On Unsupervised Learning with Python

Anomaly detection with the KDD Cup 99 dataset

This example is based on the KDD Cup 99 dataset, which collects a long series of normal and malicious internet activities. In particular, we are going to focus on the subset of HTTP requests, which has four attributes: duration, source bytes, destination bytes, and behavior (the last is really a class label, but it gives us immediate access to some specific attacks). As the original values were very small numbers around zero, all versions (including the scikit-learn one) renormalize the variables using the formula log(x + 0.1); hence, the same transformation must be applied to new samples when simulating anomaly detection. Of course, the inverse transformation is as follows:

x = exp(z) - 0.1, where z = log(x + 0.1) is the transformed value
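
As a minimal sketch of this transformation and its inverse, assuming NumPy and the natural logarithm, the two directions can be written as a pair of helper functions. The function names and the raw sample values are hypothetical and chosen only for illustration; they do not come from the dataset.

```python
import numpy as np

OFFSET = 0.1  # the constant in log(x + 0.1)


def to_log_scale(x):
    """Apply z = log(x + 0.1) element-wise (natural logarithm)."""
    return np.log(np.asarray(x, dtype=float) + OFFSET)


def from_log_scale(z):
    """Invert the transformation: x = exp(z) - 0.1."""
    return np.exp(np.asarray(z, dtype=float)) - OFFSET


# Hypothetical raw sample: duration, source bytes, destination bytes
raw_sample = np.array([0.0, 250.0, 1200.0])

# A new sample must be moved to the log scale before it is scored
scaled_sample = to_log_scale(raw_sample)

# Going back to the original units, for example to inspect an anomaly
recovered_sample = from_log_scale(scaled_sample)
```

The offset keeps the logarithm defined when an attribute is exactly zero, which is why a raw value of 0 maps to log(0.1) rather than to negative infinity.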

Let's start ...

Publisher Resources

ISBN: 9781789348279