O'Reilly logo

Exploring Data with RapidMiner by Andrew Chisholm

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Automated detection of example outliers

Four different outlier detection techniques are described in the following sections. They are as follows:

  • Detect Outlier (Distances)
  • Detect Outlier (Densities)
  • Detect Outlier (LOF)
  • Detect Outlier (COF)

None of these algorithms will automatically find the correct outliers for the data being explored. Given their parameters, they will flag up candidate outlier points to allow a person to get involved and make the final determination. This is an important point that needs to be built into any data exploration process.

Detect Outlier (Distances)

The simplest operator is Detect Outlier (Distances). Each example is considered in turn and the distance to the kth nearest neighbor is determined (k is provided as a parameter). ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required