R in a Nutshell, 2nd Edition

by Joseph Adler
October 2012
Beginner to intermediate
721 pages
21h 38m
English
O'Reilly Media, Inc.

Clustering

Another important data mining technique is clustering. Clustering is a way to find similar sets of observations in a data set; groups of similar observations are called clusters. There are several functions available for clustering in R.

Distance Measures

To effectively use clustering algorithms, you need to begin by measuring the distance between observations. A convenient way to do this in R is through the function dist in the stats package:

dist(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)

The dist function computes the distance between each pair of rows in an object such as a matrix or a data frame. It returns a distance matrix (an object of class dist) containing the computed distances. Here is a description of the arguments to dist.
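As a minimal sketch of the call described above, the following example computes Euclidean distances between the rows of a small matrix. The matrix pts and its row and column names are hypothetical toy data made up for illustration; they do not come from the book.

```r
# Hypothetical toy data: three observations (rows), two numeric variables.
pts <- matrix(c(0, 0,
                3, 4,
                6, 8),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("x", "y")))

# Pairwise distances between rows; method defaults to "euclidean".
d <- dist(pts)

# The result is a compact "dist" object holding only the lower triangle.
class(d)

# Convert to a full symmetric matrix to look up a specific pair by name.
dm <- as.matrix(d)
dm["a", "c"]
```

Converting with as.matrix is convenient for inspection, but the compact dist form is what most clustering functions expect as input.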

| Argument | Description | Default |
| --- | --- | --- |
| x | The object on which to compute distances. Must be a data frame, matrix, or dist object. | |
| method | The method for computing distances. Specify method="euclidean" for Euclidean distances (2-norm), method="maximum" for the maximum distance between observations (supremum norm), method="manhattan" for the absolute distance between two vectors (1-norm), method="canberra" for Canberra distances (see the help file), method="binary" to regard nonzero values as 1 and zeros as 0, or method="minkowski" to use the p-norm (the pth root of the sum of the pth powers of the differences of the components). | "euclidean" |
| diag | A logical value specifying whether the diagonal of the distance matrix should be printed by print.dist ... | |
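To make the method options described above concrete, here is a small sketch that applies several of them to the same pair of points. The points p1 and p2 are hypothetical values chosen so that the coordinate differences are 3 and 4, which makes the results easy to check by hand.

```r
# Two hypothetical observations whose coordinates differ by 3 and 4.
pair <- rbind(p1 = c(1, 2), p2 = c(4, 6))

# Apply three of the documented methods to the same pair of rows.
methods <- c("euclidean", "maximum", "manhattan")
dists <- sapply(methods, function(m) as.numeric(dist(pair, method = m)))
dists
# euclidean: sqrt(3^2 + 4^2) = 5
# maximum:   max(3, 4)       = 4
# manhattan: 3 + 4           = 7

# The Minkowski distance uses the p argument; here p = 3 gives
# (3^3 + 4^3)^(1/3), the cube root of 91.
mink3 <- as.numeric(dist(pair, method = "minkowski", p = 3))
mink3
```

With p = 2 the Minkowski distance matches the Euclidean result, and with p = 1 it matches the Manhattan result.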

ISBN: 9781449358204