Practical Applications of Data Mining

Book Description

Practical Applications of Data Mining emphasizes both the theory and the applications of data mining algorithms. It identifies and describes a range of data mining techniques, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each technique is introduced theoretically, and its effectiveness is demonstrated through chapter examples. The book will help database and IT professionals understand how to apply data mining techniques to real-world problems. Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe rule generation as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks, an important branch of computational intelligence, are introduced and explored to investigate the role of neural network algorithms in data analytics.

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright Page
  4. Table of Contents (1/2)
  5. Table of Contents (2/2)
  6. Preface (1/2)
  7. Preface (2/2)
  8. Foreword
  9. Foreword
  10. Chapter 1: Introduction to Data Mining
  11. 1.1 Traditional Database Management Systems
  12. 1.2 Knowledge Discovery in Databases
    1. 1.2.1 Pre-Processing
    2. 1.2.2 Data Warehousing
    3. 1.2.3 Post-Processing
  13. 1.3 Data-Mining Methods
    1. 1.3.1 Association Rules
    2. 1.3.2 Classification Learning
    3. 1.3.3 Statistical Data Mining
    4. 1.3.4 Rough Sets for Data Mining
    5. 1.3.5 Neural Networks for Data Mining
    6. 1.3.6 Clustering for Data Mining
    7. 1.3.7 Fuzzy Sets for Data Mining
  14. 1.4 Integrated Framework for Intelligent Databases
  15. 1.5 Practical Applications of Data Mining
    1. 1.5.1 Healthcare Services
    2. 1.5.2 Banking
    3. 1.5.3 Supermarket Applications
    4. 1.5.4 Medical Image Classification
  16. 1.6 Chapter Summary
  17. Chapter 2: Association Rules
  18. 2.1 Introduction
  19. 2.2 Mining of Association Rules in Market Basket Data
    1. 2.2.1 Apriori Algorithm
    2. 2.2.2 Apriori-gen( ) Function
    3. 2.2.3 Apriori Example
    4. 2.2.4 AprioriTid Algorithm
  20. 2.3 Attribute-Oriented Rule Generalization
    1. 2.3.1 Concept Hierarchies
    2. 2.3.2 Basic Strategies for Attribute-Oriented Induction
    3. 2.3.3 Basic Attribute-Oriented Induction Algorithm
    4. 2.3.4 Generation of Discrimination Rules through Attribute-Oriented Induction
  21. 2.4 Association Rules in Hypertext Databases
    1. 2.4.1 Formal Model
    2. 2.4.2 Algorithms for Generating Composite Association Rules
  22. 2.5 Quantitative Association Rules
    1. 2.5.1 Mapping of Quantitative Association Rules
    2. 2.5.2 Problem Decomposition
    3. 2.5.3 Partitioning of Quantitative Attributes
  23. 2.6 Mining of Compact Rules
    1. 2.6.1 Semantic Association Relationships
    2. 2.6.2 Generalization Algorithm
    3. 2.6.3 Learning Process
    4. 2.6.4 Learning Algorithm
  24. 2.7 Mining of Time-Constrained Association Rules
    1. 2.7.1 Time-Constrained Association Rules
    2. 2.7.2 Properties of Time Constraints
    3. 2.7.3 Potential Applications
  25. 2.8 Chapter Summary
  26. 2.9 Exercises
  27. 2.10 Selected Bibliographic Notes
  28. 2.11 Chapter Bibliography
  29. Chapter 3: Classification Learning
  30. 3.1 Introduction
  31. 3.2 Knowledge Representation
    1. 3.2.1 Classification Rules
    2. 3.2.2 Decision Trees
  32. 3.3 Separate-and-Conquer Approach
    1. 3.3.1 Prism
    2. 3.3.2 Induct
    3. 3.3.3 REP, IREP, RIPPER
  33. 3.4 Divide-and-Conquer Approach
    1. 3.4.1 ID3
    2. 3.4.2 C4.5 and C5.0
  34. 3.5 Partial Decision Tree (1/2)
  35. 3.5 Partial Decision Tree (2/2)
  36. 3.6 Chapter Summary
  37. 3.7 Exercises
  38. 3.8 Selected Bibliographic Notes
  39. 3.9 Chapter Bibliography
  40. Chapter 4: Statistics for Data Mining
  41. 4.1 Introduction
  42. 4.2 House Sales Data
  43. 4.3 Conditional Probability
  44. 4.4 Equality Tests
  45. 4.5 Correlation Coefficient
  46. 4.6 Contingency Table and the χ² Test (1/2)
  47. 4.6 Contingency Table and the χ² Test (2/2)
  48. 4.7 Linear Regression (1/2)
  49. 4.7 Linear Regression (2/2)
  50. 4.8 House Sales Database Revisited
  51. 4.9 Chapter Summary
  52. 4.10 Exercises
  53. 4.11 Selected Bibliographic Notes
  54. 4.12 Chapter Bibliography
  55. Chapter 5: Rough Sets and Bayes’ Theories
  56. 5.1 Introduction
  57. 5.2 Bayes’ Theorem
  58. 5.3 Rough Sets
    1. 5.3.1 Data Analysis and Representation
    2. 5.3.2 Reduction of Condition Attributes and Generation of Decision Rules
  59. 5.4 Applications Based on Bayes’ and Rough Sets
    1. 5.4.1 Customer Tendency Analysis Using Bayes’ Theory
    2. 5.4.2 Contact Lens Prescription Using Rough Set Theory
    3. 5.4.3 Welding Procedure Using Rough-Set Theory
    4. 5.4.4 Classification of Automobiles Using Both Bayes’ and Rough Set Theory
  60. 5.5 Chapter Summary
  61. 5.6 Exercises (1/2)
  62. 5.6 Exercises (2/2)
  63. 5.7 Selected Bibliographic Notes
  64. 5.8 Chapter Bibliography
  65. Chapter 6: Neural Networks
  66. 6.1 Introduction
  67. 6.2 Neural Computing and Databases
  68. 6.3 Network Classification
    1. 6.3.1 Unsupervised Learning Models
    2. 6.3.2 Supervised Learning Models
  69. 6.4 Parameters of the Learning Process
    1. 6.4.1 Number of Hidden Layers
    2. 6.4.2 Number of Hidden Nodes
    3. 6.4.3 Early Stopping
    4. 6.4.4 Convergence Curve (Back-Propagation Neural Network)
  70. 6.5 Network Structures
    1. 6.5.1 Neural Net and Traditional Classifiers
  71. 6.6 Knowledge Discovery in Databases
    1. 6.6.1 Normalization
  72. 6.7 Backpropagation Neural Network (BPNN) Model
    1. 6.7.1 Network Architecture
    2. 6.7.2 Algorithm
    3. 6.7.3 Example I
    4. 6.7.4 Example II (Retrieval of Data Using the BPNN Model)
  73. 6.8 Bidirectional Associative Memory (BAM) Model
    1. 6.8.1 Network Architecture
    2. 6.8.2 Algorithm
    3. 6.8.3 Example with Four Training Vectors
  74. 6.9 Learning Vector Quantization (LVQ) Model
    1. 6.9.1 Network Architecture
    2. 6.9.2 Algorithm
    3. 6.9.3 Example
  75. 6.10 Probabilistic Neural Network (PNN) Model
    1. 6.10.1 Network Architecture
    2. 6.10.2 Algorithm
    3. 6.10.3 Example
    4. 6.10.4 Parameter Adjustment Using a Smoothing Factor
  76. 6.11 Chapter Summary
  77. 6.12 Exercises (1/2)
  78. 6.12 Exercises (2/2)
  79. 6.13 Selected Bibliographic Notes
  80. 6.14 Chapter Bibliography
  81. Chapter 7: Clustering
  82. 7.1 Introduction
  83. 7.2 Definition of Clusters and Clustering
  84. 7.3 Clustering Procedures
  85. 7.4 Clustering Concepts
    1. 7.4.1 Choosing Variables
    2. 7.4.2 Similarity and Dissimilarity Measurement
    3. 7.4.3 Standardization of Variables
    4. 7.4.4 Weights and Threshold Values
    5. 7.4.5 Association Rules
  86. 7.5 Clustering Algorithms
    1. 7.5.1 Hierarchical Algorithms
    2. 7.5.2 Graph Theory Algorithm with the Single-link Method
    3. 7.5.3 Partition Algorithms: K-means Algorithm
    4. 7.5.4 Density-Search Algorithms
    5. 7.5.5 Association Rule Algorithms
  87. 7.6 Chapter Summary
  88. 7.7 Exercises
  89. 7.8 Selected Bibliographic Notes
  90. 7.9 Chapter Bibliography
  91. Chapter 8: Fuzzy Information Retrieval
  92. 8.1 Introduction
  93. 8.2 Fuzzy Set Basics
  94. 8.3 Fuzzy Set Applications
    1. 8.3.1 Project Management
    2. 8.3.2 Data Analysis
    3. 8.3.3 Nuanced Information Systems
  95. 8.4 Linguistic Variables
  96. 8.5 Fuzzy Query Processing (1/3)
  97. 8.5 Fuzzy Query Processing (2/3)
  98. 8.5 Fuzzy Query Processing (3/3)
  99. 8.6 Fuzzy Query Processing Using Fuzzy Tables
    1. 8.6.1 Convert Raw Data to Fuzzy Member Functions
    2. 8.6.2 Fuzzy Table
    3. 8.6.3 Fuzzy Search Engine
    4. 8.6.4 Fuzzy Table Construction
    5. 8.6.5 Fuzzy Query Processing
  100. 8.7 Role of Relational Division for Information Retrieval
    1. 8.7.1 Information Retrieval through Relational Division
    2. 8.7.2 Information Retrieval through Fuzzy Relational Division
  101. 8.8 Alpha-Cut Thresholds
  102. 8.9 Chapter Summary
  103. 8.10 Exercises (1/2)
  104. 8.10 Exercises (2/2)
  105. 8.11 Selected Bibliographic Notes
  106. 8.12 Chapter Bibliography
  107. Appendix (1/3)
  108. Appendix (2/3)
  109. Appendix (3/3)
  110. Index (1/2)
  111. Index (2/2)