Chapter 15

Clustering

In This Chapter

arrow Exploring the potentialities of unsupervised clustering

arrow Making K-means work with small and big data

arrow Trying DBScan as an alternative option

One of the basic abilities that humans have exercised since primitive times is to divide the known world into separate classes where individual objects share common features deemed important by the classifier. Starting with primitive cave dwellers classifying the natural world they lived in, distinguishing plants and animals useful or dangerous for their survival, we arrive at modern times in which marketing departments classify consumers into target segments and then act with proper marketing plans.

Classifying is crucial to our process of building new knowledge because, by gathering similar objects, we can

  • Mention all the items in a class by the same denomination
  • Summarize relevant features by an exemplificative class type
  • Associate particular actions or recall specific knowledge automatically

Dealing with big data streams today requires the same classificatory ability, but on a different scale. To spot unknown groups of signals present in the data, we need specialized algorithms that are both able to ...

Get Python for Data Science For Dummies now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.