Intrusion detection and density-based methods

Here is a formal definition of outliers formalized based on concepts such as, LOF, LRD, and so on. Generally speaking, an outlier is a data point biased from others so much that it seems as if it has not been generated from the same distribution functions as others have been.

Given a dataset, D, a DB (x, y)-outlier, p, is defined like this:

Intrusion detection and density-based methods

The k-distance of the p data point denotes the distance between p and the data point, o, which is member of D:

Intrusion detection and density-based methods

The k-distance neighborhood of the p object is defined as ...

Get R: Mining Spatial, Text, Web, and Social Media Data now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.