Clustering High-Dimensional Data
University of Alberta, Edmonton, Canada
zimek@ualberta.ca
Clustering is commonly defined as the task of finding a set of groups of similar objects within a data set while keeping dissimilar objects separated in different groups or in the group of noise. Although Estivill-Castro criticizes this definition for including a grouping criterion, this criterion (similarity) is exactly what is in question among the many different approaches. Especially in high-dimensional data, the meaning and definition of similarity lie right at the heart of the problem. In many cases, the similarity of objects is assessed within subspaces, e.g., using a subset of the dimensions ...
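To make the idea of assessing similarity within a subspace concrete, the following minimal sketch compares the distance between two objects in the full space with the distance restricted to a subset of the dimensions. The helper `euclidean`, the points `p` and `q`, and the chosen dimension indices are hypothetical and serve only as illustration; Euclidean distance is assumed here as one possible (dis)similarity measure, not as the measure any particular approach prescribes.

```python
import math

def euclidean(a, b, dims=None):
    """Euclidean distance restricted to the given dimension indices.

    If dims is None, all dimensions are used (full-space distance).
    """
    if dims is None:
        dims = range(len(a))
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in dims))

# Hypothetical 5-dimensional objects: p and q are close on dimensions 0 and 1
# but differ strongly on dimensions 2, 3, and 4.
p = [1.5, 2.0, 9.0, 0.0, 7.0]
q = [1.0, 2.0, 0.0, 8.0, 1.0]

full = euclidean(p, q)               # distance over all five dimensions
sub = euclidean(p, q, dims=[0, 1])   # distance in the subspace {0, 1} only

print(f"full-space distance: {full:.2f}")
print(f"subspace distance:   {sub:.2f}")
```

In this toy example the two objects appear dissimilar in the full space yet similar in the subspace spanned by dimensions 0 and 1, which illustrates why the choice of dimensions can change which objects are grouped together.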