December 2018
Beginner to intermediate
684 pages
21h 9m
English
Density-based spatial clustering of applications with noise (DBSCAN) was developed in 1996 and received the Test of Time award at the 2014 KDD conference for the attention it has attracted in both theory and practice.
It aims to identify core and non-core samples, where the former extend a cluster and the latter are part of a cluster but do not have sufficient nearby neighbors to further grow the cluster. Other samples are outliers and not assigned to any cluster.
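It uses an eps parameter for the radius of the neighborhood and min_samples for the minimum number of samples that must fall within that radius for a sample to count as a core sample. It is deterministic for a given ordering of the data and exclusive, meaning that each sample belongs to at most one cluster, and it has difficulties with clusters of different density and with high-dimensional data. It can be challenging ...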
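To make the roles of eps and min_samples concrete, here is a minimal pure-Python sketch of the labeling logic described above. It is an illustration under stated assumptions, not the book's or any library's implementation: the function name dbscan, the NOISE constant, and the toy points are hypothetical, and a sample is assumed to count as a member of its own neighborhood.

```python
from math import dist

NOISE = -1  # hypothetical marker for samples not assigned to any cluster


def dbscan(points, eps, min_samples):
    """Return (labels, is_core): a cluster id or NOISE per point, plus core flags."""
    n = len(points)
    # Neighborhood of each point: indices within radius eps.
    # Assumption: the point itself is counted as one of its neighbors.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]

    labels = [NOISE] * n
    cluster_id = 0
    for i in range(n):
        # Start a cluster only from a core sample that is still unassigned.
        if labels[i] != NOISE or not is_core[i]:
            continue
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            current = frontier.pop()
            for j in neighbors[current]:
                if labels[j] == NOISE:
                    labels[j] = cluster_id
                    # Only core samples extend the cluster further;
                    # non-core samples join it but are not expanded.
                    if is_core[j]:
                        frontier.append(j)
        cluster_id += 1
    return labels, is_core


# Toy data (made up for illustration).
points = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),  # first dense group
    (0.35, 0.05),                                     # fringe of the first group
    (5.0, 5.0), (5.0, 5.1), (5.1, 5.0), (5.1, 5.1),  # second dense group
    (10.0, 0.0),                                      # isolated point
]

labels, is_core = dbscan(points, eps=0.3, min_samples=4)
print(labels)   # [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
print(is_core)  # [True, True, True, True, False, True, True, True, True, False]
```

The fringe point at index 4 has only three samples within eps (itself included), so it is a non-core member of the first cluster, while the isolated point at index 9 is reachable from no core sample and stays an outlier. Because the neighborhoods are built by comparing every pair of points, this sketch is quadratic in the number of samples and is meant for reading rather than for real data sets.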