Chapter 2

Feature Selection for Clustering: A Review

Salem Alelyani

Arizona State University, Tempe, AZ, salelyan@asu.edu

Jiliang Tang

Arizona State University, Tempe, AZ, Jiliang.Tang@asu.edu

Huan Liu

Arizona State University, Tempe, AZ, huan.liu@asu.edu

2.1 Introduction

The rise of high-throughput technologies has led to exponential growth in harvested data, in both dimensionality and sample size. As a consequence, storing and processing these data have become more challenging. Figure (2.1) shows this trend for the UCI Machine Learning Repository. This growth has made manual processing of these datasets impractical. Therefore, data mining and machine learning tools were proposed to automate pattern recognition and the ...
