Intrusion detection and density-based methods

Outliers can be defined formally using concepts such as the local outlier factor (LOF) and the local reachability density (LRD). Generally speaking, an outlier is a data point that deviates so much from the others that it appears to have been generated by a different distribution.

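Given a dataset, D, a data point p in D is a DB(x, y)-outlier if at least a fraction x of the data points in D lie at a distance greater than y from p. Equivalently, the number of points q in D with d(p, q) ≤ y is at most (1 - x) times the size of D.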
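The distance-based definition can be illustrated with a minimal Python sketch. It assumes the common reading of a DB(x, y)-outlier, in which p qualifies when at least a fraction x of the points in D lie farther than y from p. The function name `is_db_outlier`, the Euclidean metric, and the sample points are hypothetical choices made for illustration and do not come from the original text.

```python
import math

def is_db_outlier(index, data, x, y):
    """Check whether data[index] is a DB(x, y)-outlier.

    Assumed definition: at least a fraction x of the points in data
    lie at a Euclidean distance greater than y from the point.
    """
    p = data[index]
    # Count the points that are farther than y from p (p itself is never far).
    far = sum(1 for q in data if math.dist(p, q) > y)
    return far >= x * len(data)

# Hypothetical sample: four clustered points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
flags = [is_db_outlier(i, points, x=0.75, y=1.0) for i in range(len(points))]
print(flags)
```

Only the distant point is flagged here, because four of the five points lie more than 1.0 away from it, while each clustered point has just one point beyond that distance.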

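For a positive integer k, the k-distance of a data point p, written k-distance(p), is the distance d(p, o) between p and a data point o in D such that at least k data points o' in D \ {p} satisfy d(p, o') ≤ d(p, o), and at most k - 1 data points o' in D \ {p} satisfy d(p, o') < d(p, o). In other words, it is the distance from p to its k-th nearest neighbor.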
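As a rough illustration, the k-distance can be computed by sorting the distances from p to every other point and taking the k-th smallest. The sketch below assumes the usual reading of k-distance as the distance to the k-th nearest neighbor, with p itself excluded. The name `k_distance`, the Euclidean metric, and the sample points are hypothetical and chosen only for this example.

```python
import math

def k_distance(index, data, k):
    """Return the distance from data[index] to its k-th nearest neighbor.

    The point itself is excluded. Taking the k-th smallest distance
    handles ties: at least k points are no farther, and at most k - 1
    points are strictly closer.
    """
    p = data[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(data) if j != index)
    if not 1 <= k <= len(dists):
        raise ValueError("k must be between 1 and len(data) - 1")
    return dists[k - 1]

# Hypothetical sample: distances from the first point are 1, 2, and 5.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(k_distance(0, points, 2))
```

Sorting every distance costs O(n log n) per point, which is fine for a sketch; larger datasets would normally use a spatial index instead.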

The k-distance neighborhood of the p object is defined as ...
