DBSCAN – a density-based clustering technique

Now we will introduce you to DBSCAN, a density-based clustering technique. It's a very simple technique. It selects a random point; if the point is in a dense area (that is, if it has more than N neighbors), it starts growing a cluster that includes all of its neighbors, and the neighbors of those neighbors, until it reaches points that have no more neighbors to add.

If the point is not in a dense area, it is classified as noise (a cluster grown later may still absorb it if it falls within the neighborhood of one of that cluster's dense points). Then, another unlabeled point is selected randomly and the process starts over. This technique is great for non-spherical clusters, but it works equally well with spherical ones. The input is just the neighborhood radius (the eps parameter, that is, the maximum distance between two ...
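The procedure described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the book's code or a library implementation: the names dbscan, region_query, and NOISE are hypothetical, min_samples stands in for N, and the sketch assumes that a point counts as dense when at least min_samples points (itself included) lie within eps of it, using Euclidean distance.

```python
import math
import random

NOISE = -1  # assumed label for points that end up in no cluster


def region_query(points, i, eps):
    """Return the indices of all points within eps of points[i], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples, seed=0):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)          # None means "not visited yet"
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)     # visit points in random order
    cluster = -1
    for i in order:
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:   # not in a dense area
            labels[i] = NOISE
            continue
        cluster += 1                       # dense point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:         # former noise becomes a border point
                labels[j] = cluster
                continue
            if labels[j] is not None:      # already assigned to a cluster
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # keep growing through dense points
    return labels


# Two small square groups of points and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

With these made-up points, each square group should receive its own cluster id and the isolated point should be labeled as noise; which group gets which id depends on the random visiting order. The neighbor search here compares every pair of points, which is fine for a sketch but too slow for large datasets.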
