Intrusion detection and density-based methods

Here is a formal definition of outliers formalized based on concepts such as, LOF, LRD, and so on. Generally speaking, an outlier is a data point biased from others so much that it seems as if it has not been generated from the same distribution functions as others have been.

Given a dataset, D, a DB (x, y)-outlier, p, is defined like this:

Intrusion detection and density-based methods

The k-distance of the p data point denotes the distance between p and the data point, o, which is member of D:

Intrusion detection and density-based methods

The k-distance neighborhood of the p object is defined as ...

Get R: Mining Spatial, Text, Web, and Social Media Data now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.