Understanding how examples relate to one another is important because examples that lie close together may be duplicates. It is worth understanding how such examples arise and what, if anything, needs to be done about them.
Closeness in this context is quantified by a measure such as Euclidean distance or cosine similarity. RapidMiner can calculate many such measures, and a brief explanation of Euclidean distance is given in the next section.
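To make the two measures concrete outside RapidMiner, the following is a minimal Python sketch using only the standard library. The function names (`euclidean_distance`, `cosine_similarity`, `near_duplicates`), the sample points, and the threshold are all hypothetical and chosen for illustration; this is not how RapidMiner implements its operators.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def near_duplicates(points, threshold):
    """Return index pairs whose Euclidean distance is at most `threshold`.

    Compares every pair, so the cost grows quadratically with the number
    of points; acceptable for a small illustration only.
    """
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if euclidean_distance(points[i], points[j]) <= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two points are candidates for duplicates.
sample = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
candidate_pairs = near_duplicates(sample, threshold=0.5)
```

Note that the two measures point in opposite directions: a small Euclidean distance indicates closeness, whereas a cosine similarity near 1 does. The threshold that separates a likely duplicate from a merely similar example depends on the data and has to be chosen by inspection.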
The following screenshot shows three data points in two dimensions:
The points are labeled 1, 2, and 3 and the Euclidean distances between ...