O'Reilly logo

Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Hierarchical clustering

Hierarchical clustering is a structured clustering approach that results in a multilevel hierarchy of clusters, where each cluster might contain many subclusters (or child clusters). Each child cluster is, thus, linked to the parent cluster. This form of clustering is often also called tree clustering.

Agglomerative clustering is a bottom-up approach where:

  • Each data point begins in its own cluster
  • The similarity (or distance) between each pair of clusters is evaluated
  • The pair of clusters that are most similar are found; this pair is then merged to form a new cluster
  • The process is repeated until only one top-level cluster remains

Divisive clustering is a top-down approach that works in reverse, starting with one ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required