Chapter 3

Probabilistic Models for Clustering

Hongbo Deng

University of Illinois at Urbana-ChampaignUrbana, ILhbdeng@illinois.edu

Jiawei Han

University of Illinois at Urbana-ChampaignUrbana, ILhanj@illinois.edu

3.1 Introduction

Probabilistic model-based clustering techniques have been widely used and have shown promising results in many applications, ranging from image segmentation [71, 15], handwriting recognition [60], document clustering [36, 81], topic modeling [35, 14] to information retrieval [43]. Model-based clustering approaches attempt to optimize the fit between the observed data and some mathematical model using a probabilistic approach. Such methods are often based on the assumption that the data are generated by a mixture of underlying ...

Get Data Clustering now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.