Measurements extracted from biological systems may depend on a large number of variables in ways that are not yet understood. One method of analyzing such data sets is to group data vectors that are similar. Once a group is collected, it can be analyzed further to find the reasons for the similarity. Clustering is often used to create these groups, and the most common clustering method is the k-means algorithm. This chapter focuses on the development and use of the k-means method and some useful extensions.
17.1 The Purpose of Clustering
Given a set of data vectors, the object is to group the ...