Exploring Data with RapidMiner by Andrew Chisholm

Relations between examples

Understanding how examples relate to each other is important because examples that are close to one another may be duplicates. It is worth understanding how such examples arise and what, if anything, needs to be done about them.

Closeness in this context is quantified by a distance or similarity measure, such as Euclidean distance or cosine similarity. RapidMiner can calculate many such measures; a brief explanation of Euclidean distance is given in the next section.
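
To make the two measures named above concrete, here is a minimal sketch in plain Python. It is not RapidMiner code and does not reflect how RapidMiner computes these values; the function names and the sample points are hypothetical, chosen only for illustration.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical two-dimensional examples (not the points from the screenshot).
p = (3.0, 4.0)
q = (6.0, 8.0)

print(euclidean_distance(p, q))  # 5.0: the points are five units apart
print(cosine_similarity(p, q))   # 1.0: both point in the same direction
```

The two measures answer different questions: these sample points are maximally similar by cosine similarity yet clearly separated by Euclidean distance, so which examples count as "close" depends on the measure chosen.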

The following screenshot shows three data points in two dimensions:

[Screenshot: Relations between examples]

The points are labeled 1, 2, and 3 and the Euclidean distances between ...