3 Clustering Approaches to Networks

Vladimir Batagelj 1,2,3

1 IMFM, Ljubljana

2 IAM, University of Primorska, Koper

3 NRU, HSE, Moscow

3.1 Introduction

Clustering and classification are two related activities, and the terms are sometimes used as synonyms. In clustering, the goal is to identify, in a given set of units, groups (clusters, classes) of (usually) similar units. In classification, a given unit has to be assigned to the corresponding (predefined) group. These two activities are embedded in our language and are therefore basic to most of our daily tasks.

The earliest classification systems were taxonomies of animals and plants: Shen Nung, China, ca. 3000 BCE, and the Ebers Papyrus, Egypt, ca. 1500 BCE. A theoretical framework was proposed by Aristotle (384–322 BCE). The taxonomic systems were later refined through the work of Linnaeus (1707–1778) and Darwin (1809–1882), the discovery of the structure of DNA (1953), and the PhyloCode (1998).

The first steps towards “numeric” clustering procedures were taken in the first half of the 20th century by defining different (dis)similarity measures, such as the Czekanowski coefficient (1909), the coefficient of racial likeness (Pearson, 1926), the generalized distance (Mahalanobis, 1936), etc. Early methods were proposed within biometrics and psychometrics by Driver and Kroeber (1932), Forbes (1933), Zubin (1938), Sturtevant (1939), etc. Kruskal's ...

