Chapter 23

Clustering Validation Measures

Hui Xiong

Rutgers, The State University of New Jersey, Newark, NJ 07102, hxiong@rutgers.edu

Zhongmou Li

Rutgers, The State University of New Jersey, Newark, NJ 07102, mosesli@pegasus.rutgers.edu

23.1 Introduction

Clustering, one of the most important unsupervised learning problems, is the task of dividing a set of objects into clusters such that objects within the same cluster are similar while objects in different clusters are distinct. Clustering is widely used in many fields, such as text mining, image analysis, and bioinformatics [16, 69, 17]. Because clustering is an unsupervised learning task, it is necessary to find a way to validate the goodness of partitions after clustering. Otherwise, it would be difficult to make use ...
