This algorithm (whose name stands for Balanced Iterative Reducing and Clustering using Hierarchies) has slightly more complex dynamics than mini-batch K-means, and its final stage employs a method (hierarchical clustering) that we are going to present in Chapter 4, Hierarchical Clustering in Action. However, for our purposes, the most important part is the data preparation phase, which is based on a particular tree structure called a Clustering Feature Tree or Characteristic Feature Tree (CF-Tree). Given a dataset X, every node of the tree is made up of a tuple of three elements:
The characteristic elements are respectively the number of sample ...
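To make the three-element tuple more concrete, the following is a minimal sketch of a clustering feature. The description above is cut off, so the sketch assumes the definition commonly used for BIRCH: the number of samples, the component-wise sum of the samples, and the sum of their squared norms. The class name `ClusteringFeature` and its methods are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """Hypothetical summary of a group of samples (assumed standard BIRCH tuple)."""
    n: int                    # number of samples summarized
    linear_sum: List[float]   # component-wise sum of the samples
    squared_sum: float        # sum of the squared Euclidean norms

    @classmethod
    def from_samples(cls, samples: Sequence[Sequence[float]]) -> "ClusteringFeature":
        dim = len(samples[0])
        linear_sum = [sum(x[d] for x in samples) for d in range(dim)]
        squared_sum = sum(v * v for x in samples for v in x)
        return cls(len(samples), linear_sum, squared_sum)

    def merge(self, other: "ClusteringFeature") -> "ClusteringFeature":
        # The three elements are additive, so two summaries can be
        # combined without revisiting the original samples.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.squared_sum + other.squared_sum,
        )

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def radius(self) -> float:
        # Root of (mean squared norm - squared norm of the centroid),
        # clipped at zero to absorb rounding errors.
        centroid_sq = sum(c * c for c in self.centroid())
        return max(self.squared_sum / self.n - centroid_sq, 0.0) ** 0.5


left = ClusteringFeature.from_samples([[0.0, 0.0], [2.0, 0.0]])
right = ClusteringFeature.from_samples([[1.0, 3.0]])
merged = left.merge(right)
```

Because the three quantities simply add up, merging two summaries gives the same tuple as summarizing all the samples together, and the centroid and radius can be recovered from the tuple alone. This is the property that lets the data preparation phase compress a dataset without storing every sample.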