Data Clustering by Chandan K. Reddy, Charu C. Aggarwal

Chapter 12

Clustering Categorical Data

Bill Andreopoulos

Lawrence Berkeley National Laboratory
Berkeley, CA
billandreo@gmail.com

12.1 Introduction

A growing number of clustering algorithms for categorical data have been proposed in recent years, along with interesting applications, such as partitioning large software systems [8, 9] and protein interaction data [13, 21, 38, 77].

A categorical dataset with m attributes is viewed as an m-dimensional “cube”, offering a spatial density basis for clustering. A cell of the cube is mapped to the number of objects having values equal to its coordinates. Clusters in such a cube are regarded as subspaces of high object density and are separated by subspaces of low object density. Clustering the cube poses ...
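The cube view described above can be illustrated with a minimal sketch, assuming each object is a tuple of m categorical values. The function name `cell_densities`, the toy records, and the `min_count` threshold are hypothetical choices made for illustration; they are not taken from the chapter. The sketch only counts objects per cell and flags dense cells. It does not implement any particular clustering algorithm.

```python
from collections import Counter

def cell_densities(records):
    """Map each cube cell (a tuple of attribute values) to the number
    of objects whose values equal that cell's coordinates."""
    return Counter(tuple(record) for record in records)

# Toy dataset with m = 2 categorical attributes (color, size).
records = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "large"),
    ("blue", "large"),
    ("blue", "large"),
]

densities = cell_densities(records)

# Hypothetical density threshold: a cell counts as "dense" if it
# holds at least min_count objects.
min_count = 2
dense_cells = {cell for cell, count in densities.items() if count >= min_count}

for cell, count in sorted(densities.items()):
    print(cell, count)
```

Cells that never occur in the data are simply absent from the counter and read back as zero, which matches the idea of empty, low-density regions of the cube.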
