Outliers can be defined formally using concepts such as the local outlier factor (LOF), the local reachability density (LRD), and so on. Generally speaking, an outlier is a data point that deviates from the others so much that it appears not to have been generated by the same distribution as the rest.

Given a dataset, `D`, a DB (x, y)-outlier, `p`, is defined like this:

The k-distance of the `p` data point denotes the distance between `p` and the data point, `o`, which is a member of `D`:

The k-distance neighborhood of the `p` object is defined as ...
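
As a rough illustration of these two notions, the following Python sketch assumes the standard LOF-style definitions: the k-distance of `p` is the distance from `p` to its k-th nearest neighbor in `D` (excluding `p` itself), and the k-distance neighborhood of `p` is every other object whose distance from `p` does not exceed that value. The function names, the Euclidean metric, and the sample points are assumptions made for illustration and are not taken from the text.

```python
import math


def k_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor (assumed Euclidean)."""
    # Distances from the chosen point to every other point in the dataset.
    others = [math.dist(points[i], o) for j, o in enumerate(points) if j != i]
    if not 1 <= k <= len(others):
        raise ValueError("k must be between 1 and len(points) - 1")
    return sorted(others)[k - 1]


def k_distance_neighborhood(points, i, k):
    """Indices of all other points lying within the k-distance of points[i]."""
    kd = k_distance(points, i, k)
    return [j for j, o in enumerate(points)
            if j != i and math.dist(points[i], o) <= kd]


# Hypothetical sample data: p is the point at index 0.
D = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (10.0, 10.0)]
print(k_distance(D, 0, 2))               # distance to the 2nd nearest neighbor
print(k_distance_neighborhood(D, 0, 2))  # indices of the neighbors of p

# With ties, the neighborhood can hold more than k objects.
D_ties = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
print(k_distance_neighborhood(D_ties, 0, 1))
```

Under these assumptions the neighborhood always contains at least k objects, and it contains more than k when several objects are tied at exactly the k-distance, as in the second dataset.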
