Book description
Research on the problem of clustering tends to be fragmented across the pattern recognition, database, data mining, and machine learning communities. Addressing this problem in a unified way, Data Clustering: Algorithms and Applications provides complete coverage of the entire area of clustering, from basic methods to more refined and complex data clustering approaches. It pays special attention to recent issues in graphs, social networks, and other domains.
The book focuses on three primary aspects of data clustering:
- Methods, describing key techniques commonly used for clustering, such as feature selection, agglomerative clustering, partitional clustering, density-based clustering, probabilistic clustering, grid-based clustering, spectral clustering, and nonnegative matrix factorization
- Domains, covering methods used for different domains of data, such as categorical data, text data, multimedia data, graph data, biological data, stream data, uncertain data, time series clustering, high-dimensional clustering, and big data
- Variations and Insights, discussing important variations of the clustering process, such as semisupervised clustering, interactive clustering, multiview clustering, cluster ensembles, and cluster validation
In this book, top researchers from around the world explore the characteristics of clustering problems in a variety of application areas. They also explain how to glean detailed insight from the clustering process—including how to verify the quality of the underlying clusters—through supervision, human intervention, or the automated generation of alternative clusters.
Table of contents
- Cover
- Preliminaries
- Preface
- Editor Biographies
- Contributors
- Chapter 1 An Introduction to Cluster Analysis
- 1.1 Introduction
- 1.2 Common Techniques Used in Cluster Analysis
- 1.3 Data Types Studied in Cluster Analysis
- 1.4 Insights Gained from Different Variations of Cluster Analysis
- 1.5 Discussion and Conclusions
- Bibliography
- Chapter 2 Feature Selection for Clustering: A Review
- Chapter 3 Probabilistic Models for Clustering
- Chapter 4 A Survey of Partitional and Hierarchical Clustering Algorithms
- 4.1 Introduction
- 4.2 Partitional Clustering Algorithms
- 4.2.1 K-Means Clustering
- 4.2.2 Minimization of Sum of Squared Errors
- 4.2.3 Factors Affecting K-Means
- 4.2.4 Variations of K-Means
- 4.2.4.1 K-Medoids Clustering
- 4.2.4.2 K-Medians Clustering
- 4.2.4.3 K-Modes Clustering
- 4.2.4.4 Fuzzy K-Means Clustering
- 4.2.4.5 X-Means Clustering
- 4.2.4.6 Intelligent K-Means Clustering
- 4.2.4.7 Bisecting K-Means Clustering
- 4.2.4.8 Kernel K-Means Clustering
- 4.2.4.9 Mean Shift Clustering
- 4.2.4.10 Weighted K-Means Clustering
- 4.2.4.11 Genetic K-Means Clustering
- 4.2.5 Making K-Means Faster
- 4.3 Hierarchical Clustering Algorithms
- 4.4 Discussion and Summary
- Bibliography
- Chapter 5 Density-Based Clustering
- Chapter 6 Grid-Based Clustering
- 6.1 Introduction
- 6.2 The Classical Algorithms
- 6.3 Adaptive Grid-Based Algorithms
- 6.4 Axis-Shifting Grid-Based Algorithms
- 6.5 High-Dimensional Algorithms
- 6.6 Conclusions and Summary
- Bibliography
- Chapter 7 Nonnegative Matrix Factorizations for Clustering: A Survey
- Chapter 8 Spectral Clustering
- 8.1 Introduction
- 8.2 Similarity Graph
- 8.3 Unnormalized Spectral Clustering
- 8.4 Normalized Spectral Clustering
- 8.5 Graph Cut View
- 8.6 Random Walks View
- 8.7 Connection to Laplacian Eigenmap
- 8.8 Connection to Kernel k-Means and Nonnegative Matrix Factorization
- 8.9 Large Scale Spectral Clustering
- 8.10 Further Reading
- Bibliography
- Chapter 9 Clustering High-Dimensional Data
- Chapter 10 A Survey of Stream Clustering Algorithms
- 10.1 Introduction
- 10.2 Methods Based on Partitioning Representatives
- 10.3 Density-Based Stream Clustering
- 10.4 Probabilistic Streaming Algorithms
- 10.5 Clustering High-Dimensional Streams
- 10.6 Clustering Discrete and Categorical Streams
- 10.7 Text Stream Clustering
- 10.8 Other Scenarios for Stream Clustering
- 10.9 Discussion and Conclusions
- Bibliography
- Chapter 11 Big Data Clustering
- Chapter 12 Clustering Categorical Data
- Chapter 13 Document Clustering: The Next Frontier
- Chapter 14 Clustering Multimedia Data
- Chapter 15 Time-Series Data Clustering
- 15.1 Introduction
- 15.2 The Diverse Formulations for Time-Series Clustering
- 15.3 Online Correlation-Based Clustering
- 15.4 Similarity and Distance Measures
- 15.5 Shape-Based Time-Series Clustering Techniques
- 15.6 Time-Series Clustering Applications
- 15.7 Conclusions
- Bibliography
- Chapter 16 Clustering Biological Data
- 16.1 Introduction
- 16.2 Clustering Microarray Data
- 16.3 Clustering Biological Networks
- 16.4 Biological Sequence Clustering
- 16.5 Software Packages
- 16.6 Discussion and Summary
- Bibliography
- Chapter 17 Network Clustering
- 17.1 Introduction
- 17.2 Background and Nomenclature
- 17.3 Problem Definition
- 17.4 Common Evaluation Criteria
- 17.5 Partitioning with Geometric Information
- 17.6 Graph Growing and Greedy Algorithms
- 17.7 Agglomerative and Divisive Clustering
- 17.8 Spectral Clustering
- 17.9 Markov Clustering
- 17.10 Multilevel Partitioning
- 17.11 Local Partitioning Algorithms
- 17.12 Hypergraph Partitioning
- 17.13 Emerging Methods for Partitioning Special Graphs
- 17.14 Conclusion
- Acknowledgments
- Bibliography
- Chapter 18 A Survey of Uncertain Data Clustering Algorithms
- 18.1 Introduction
- 18.2 Mixture Model Clustering of Uncertain Data
- 18.3 Density-Based Clustering Algorithms
- 18.4 Partitional Clustering Algorithms
- 18.5 Clustering Uncertain Data Streams
- 18.6 Clustering Uncertain Data in High Dimensionality
- 18.7 Clustering with the Possible Worlds Model
- 18.8 Clustering Uncertain Graphs
- 18.9 Conclusions and Summary
- Bibliography
- Chapter 19 Concepts of Visual and Interactive Clustering
- Chapter 20 Semisupervised Clustering
- 20.1 Introduction
- 20.2 Clustering with Pointwise and Pairwise Semisupervision
- 20.3 Semisupervised Graph Cuts
- 20.4 A Unified View of Label Propagation
- 20.5 Semisupervised Embedding
- 20.6 Comparative Experimental Analysis
- 20.7 Conclusions
- Bibliography
- Chapter 21 Alternative Clustering Analysis: A Review
- 21.1 Introduction
- 21.2 Technical Preliminaries
- 21.3 Multiple Clustering Analysis Using Alternative Clusterings
- 21.4 Connections to Multiview Clustering and Subspace Clustering
- 21.5 Future Research Issues
- 21.6 Summary
- Bibliography
- Chapter 22 Cluster Ensembles: Theory and Applications
- 22.1 Introduction
- 22.2 The Cluster Ensemble Problem
- 22.3 Measuring Similarity Between Clustering Solutions
- 22.4 Cluster Ensemble Algorithms
- 22.5 Applications of Consensus Clustering
- 22.6 Concluding Remarks
- Bibliography
- Chapter 23 Clustering Validation Measures
- 23.1 Introduction
- 23.2 External Clustering Validation Measures
- 23.2.3 Measure Normalization
- 23.2.4 Measure Properties
- 23.3 Internal Clustering Validation Measures
- 23.4 Summary
- Bibliography
- Chapter 24 Educational and Software Resources for Data Clustering
Product information
- Title: Data Clustering
- Author(s):
- Release date: August 2013
- Publisher(s): Chapman and Hall/CRC
- ISBN: 9781466558229