Exploring Data with RapidMiner by Andrew Chisholm

Chapter 5. Outliers

Outliers are almost always present in real data, and this chapter introduces ways to detect and deal with them. An outlier is an observation that does not fit with the others; mathematically, it can be considered numerically distant from the other points. Outliers can arise in different ways: through measurement error, or simply because of the distribution of the data. Their presence can adversely affect the results of a data mining exercise. That said, some data exploration activities look specifically for outliers; fraud detection is one example. It is therefore important to identify them, work out why they occur, and decide what to do about them.
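To make "numerically distant" concrete, here is a minimal sketch in Python rather than RapidMiner. It uses the interquartile-range rule of thumb, which is one common convention and not necessarily the approach this chapter takes. The function name `iqr_outliers`, the multiplier `k=1.5`, and the sample readings are all assumptions made for illustration.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles.

    Illustrative rule of thumb only; k = 1.5 is a conventional choice,
    not a value taken from the text.
    """
    # quantiles(n=4) returns the three cut points Q1, median, Q3
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sensor readings with one value far from the rest
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
print(iqr_outliers(readings))  # prints [25.0]
```

A rule like this only flags candidates. Whether a flagged value is a measurement error or a genuine feature of the distribution still has to be judged from the context of the data.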

The basic, obvious ...