Simon Fraser UniversityBritish Columbia,Canadaester@cs.sfu.ca
Many of the well-known clustering algorithms make, implicitly or explicitly, the assumption that data are generated from a probability distribution of a given type, e.g., from a mixture of k Gaussian distributions. This is the case in particular for EM (Expectation Maximization) clustering and for k-means. Due to this assumption, these algorithms produce spherical clusters and cannot deal well with datasets in which the actual clusters have nonspherical shapes. Nonspherical clusters occur naturally in spatial data, i.e., data with a reference to some two- or three-dimensional concrete space corresponding to our real world. ...