Chapter 20

Semisupervised Clustering

Amrudin Agovic

Reliancy, LLC, Saint Louis Park, MN, aagovic@cs.umn.edu

Arindam Banerjee

University of Minnesota at Twin Cities, Minneapolis, MN, banerjee@cs.umn.edu

20.1 Introduction

Semisupervised clustering (SSC) has become an important part of data mining. With an ever-increasing volume of data in several problem domains, it is more important than ever to use known information and observed relationships among data points to guide clustering.

Clustering methods are broadly divided into two groups depending on the data representation they use: feature-based, where each data point is represented by a feature vector or by a structured representation such as a sequence, time series, or graph, and
