Book description
Research on the problem of classification tends to be fragmented across such areas as pattern recognition, databases, data mining, and machine learning. Addressing the work of these different communities in a unified way, this book explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data. It presents core methods in data classification, covers recent problem domains, and discusses advanced methods for enhancing the quality of the underlying classification results.
Table of contents
- Preliminaries
- Series
- Dedication
- Editor Biography
- Contributors
- Preface
- Chapter 1 An Introduction to Data Classification
  - 1.1 Introduction
  - 1.2 Common Techniques in Data Classification
  - 1.3 Handling Different Data Types
  - 1.4 Variations on Data Classification
  - 1.5 Discussion and Conclusions
  - Bibliography
- Chapter 2 Feature Selection for Classification: A Review
- Chapter 3 Probabilistic Models for Classification
- Chapter 4 Decision Trees: Theory and Algorithms
- Chapter 5 Rule-Based Classification
  - 5.1 Introduction
  - 5.2 Rule Induction
  - 5.3 Classification Based on Association Rule Mining
  - 5.4 Applications
  - 5.5 Discussion and Conclusion
  - Bibliography
- Chapter 6 Instance-Based Learning: A Survey
  - 6.1 Introduction
  - 6.2 Instance-Based Learning Framework
  - 6.3 The Nearest Neighbor Classifier
  - 6.4 Lazy SVM Classification
  - 6.5 Locally Weighted Regression
  - 6.6 Lazy Naive Bayes
  - 6.7 Lazy Decision Trees
  - 6.8 Rule-Based Classification
  - 6.9 Radial Basis Function Networks: Leveraging Neural Networks for Instance-Based Learning
  - 6.10 Lazy Methods for Diagnostic and Visual Classification
  - 6.11 Conclusions and Summary
  - Bibliography
- Chapter 7 Support Vector Machines
- Chapter 8 Neural Networks: A Review
  - 8.1 Introduction
  - 8.2 Fundamental Concepts
  - 8.3 Single-Layer Neural Network
  - 8.4 Kernel Neural Network
  - 8.5 Multi-Layer Feedforward Network
  - 8.6 Deep Neural Networks
  - 8.7 Summary
  - Acknowledgements
  - Bibliography
- Chapter 9 A Survey of Stream Classification Algorithms
- Chapter 10 Big Data Classification
- Chapter 11 Text Classification
  - 11.1 Introduction
  - 11.2 Feature Selection for Text Classification
    - 11.2.1 Gini Index
    - 11.2.2 Information Gain
    - 11.2.3 Mutual Information
    - 11.2.4 χ²-Statistic
    - 11.2.5 Feature Transformation Methods: Unsupervised and Supervised LSI
    - 11.2.6 Supervised Clustering for Dimensionality Reduction
    - 11.2.7 Linear Discriminant Analysis
    - 11.2.8 Generalized Singular Value Decomposition
    - 11.2.9 Interaction of Feature Selection with Classification
  - 11.3 Decision Tree Classifiers
  - 11.4 Rule-Based Classifiers
  - 11.5 Probabilistic and Naive Bayes Classifiers
  - 11.6 Linear Classifiers
  - 11.7 Proximity-Based Classifiers
  - 11.8 Classification of Linked and Web Data
  - 11.9 Meta-Algorithms for Text Classification
  - 11.10 Leveraging Additional Training Data
  - 11.11 Conclusions and Summary
  - Bibliography
- Chapter 12 Multimedia Classification
- Chapter 13 Time Series Data Classification
- Chapter 14 Discrete Sequence Classification
- Chapter 15 Collective Classification of Network Data
- Chapter 16 Uncertain Data Classification
- Chapter 17 Rare Class Learning
  - 17.1 Introduction
  - 17.2 Rare Class Detection
  - 17.3 The Semi-Supervised Scenario: Positive and Unlabeled Data
  - 17.4 The Semi-Supervised Scenario: Novel Class Detection
  - 17.5 Human Supervision
  - 17.6 Other Work
  - 17.7 Conclusions and Summary
  - Bibliography
- Chapter 18 Distance Metric Learning for Data Classification
  - 18.1 Introduction
  - 18.2 The Definition of Distance Metric Learning
  - 18.3 Supervised Distance Metric Learning Algorithms
    - 18.3.1 Linear Discriminant Analysis (LDA)
    - 18.3.2 Margin Maximizing Discriminant Analysis (MMDA)
    - 18.3.3 Learning with Side Information (LSI)
    - 18.3.4 Relevant Component Analysis (RCA)
    - 18.3.5 Information Theoretic Metric Learning (ITML)
    - 18.3.6 Neighborhood Component Analysis (NCA)
    - 18.3.7 Average Neighborhood Margin Maximization (ANMM)
    - 18.3.8 Large Margin Nearest Neighbor Classifier (LMNN)
  - 18.4 Advanced Topics
  - 18.5 Conclusions and Discussions
  - Bibliography
- Chapter 19 Ensemble Learning
- Chapter 20 Semi-Supervised Learning
- Chapter 21 Transfer Learning
  - 21.1 Introduction
  - 21.2 Transfer Learning Overview
  - 21.3 Homogenous Transfer Learning
  - 21.4 Heterogeneous Transfer Learning
  - 21.5 Transfer Bounds and Negative Transfer
  - 21.6 Other Research Issues
  - 21.7 Applications of Transfer Learning
  - 21.8 Concluding Remarks
  - Bibliography
- Chapter 22 Active Learning: A Survey
  - 22.1 Introduction
  - 22.2 Motivation and Comparisons to Other Strategies
  - 22.3 Querying Strategies
  - 22.4 Active Learning with Theoretical Guarantees
  - 22.5 Dependency-Oriented Data Types for Active Learning
  - 22.6 Advanced Methods
    - 22.6.1 Active Learning of Features
    - 22.6.2 Active Learning of Kernels
    - 22.6.3 Active Learning of Classes
    - 22.6.4 Streaming Active Learning
    - 22.6.5 Multi-Instance Active Learning
    - 22.6.6 Multi-Label Active Learning
    - 22.6.7 Multi-Task Active Learning
    - 22.6.8 Multi-View Active Learning
    - 22.6.9 Multi-Oracle Active Learning
    - 22.6.10 Multi-Objective Active Learning
    - 22.6.11 Variable Labeling Costs
    - 22.6.12 Active Transfer Learning
    - 22.6.13 Active Reinforcement Learning
  - 22.7 Conclusions
  - Bibliography
- Chapter 23 Visual Classification
- Chapter 24 Evaluation of Classification Methods
- Chapter 25 Educational and Software Resources for Data Classification
Product information
- Title: Data Classification
- Author(s):
- Release date: July 2014
- Publisher(s): Chapman and Hall/CRC
- ISBN: 9781498760584