15.9. Density-Based Algorithms for Large Data Sets

In this framework, clusters are considered as regions in the l-dimensional space that are “dense” in points of X (note the close agreement between this way of viewing clusters and the definition of clusters given in [Ever 01]). Most density-based algorithms impose no restrictions on the shape of the resulting clusters, so they are able to recover arbitrarily shaped clusters. In addition, they handle outliers efficiently. Moreover, the time complexity of these algorithms is lower than O(N^2), which makes them eligible for processing large data sets.
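To make the notion of "dense regions" concrete, the following is a minimal sketch of a simplified DBSCAN-style procedure. It is not the algorithm as specified in this chapter. The function name `density_cluster`, the parameters `eps` and `min_pts`, and the toy data are assumptions chosen for illustration. A point is treated as lying in a dense region if at least `min_pts` points (itself included) fall within distance `eps` of it. Clusters grow by chaining such points together, and points that are never reached are labeled as outliers. Note that the neighborhood search here is brute force and therefore O(N^2); the lower complexity mentioned above relies on a spatial index, which this sketch omits.

```python
from collections import deque
from math import dist

NOISE = -1  # label used for outliers (an assumption of this sketch)


def density_cluster(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE for outliers."""
    n = len(points)
    # Brute-force eps-neighborhoods: O(N^2). A spatial index would
    # replace this step in an implementation aimed at large data sets.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # Point i lies in a dense region: start a new cluster and expand it.
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is dense too: keep growing
        cluster += 1
    return labels


# Toy data: two dense groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 50)]
labels = density_cluster(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point receives the label -1 because its neighborhood contains only itself. No assumption about cluster shape enters the procedure: a cluster is whatever set of points can be reached by chaining dense neighborhoods.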

Typical density-based algorithms are the DBSCAN ([Este 96]), the DBCLASD ([Xu 98]), and the ...
