BIRCH

This algorithm (whose name stands for Balanced Iterative Reducing and Clustering using Hierarchies) has slightly more complex dynamics than mini-batch K-means and the final part employs a method (hierarchical clustering) that we are going to present in Chapter 4, Hierarchical Clustering in Action. However, for our purposes, the most important part concerns the data preparation phase, which is based on a particular tree structure called Clustering or Characteristic-Feature Tree (CF-Tree). Given a dataset X, every node of the tree is made up of a tuple of three elements:

The characteristic elements are respectively the number of sample ...

Get Hands-On Unsupervised Learning with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.