Book description
Data Mining for Business Intelligence, Second Edition uses real data and actual cases to illustrate the applicability of data mining (DM) intelligence in the development of successful business models. Featuring complimentary access to XLMiner®, the Microsoft Office Excel® add-in, this book allows readers to follow along and implement algorithms at their own speed, with a minimal learning curve. In addition, students and practitioners of DM techniques are presented with hands-on, business-oriented applications. An abundance of exercises and examples, doubled in number for the second edition, is provided to motivate learning and understanding. This book helps readers understand the beneficial relationship that can be established between DM and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions. New topics include detailed coverage of visualization (enhanced by Spotfire subroutines) and time series forecasting, among other subjects.
Table of contents
- Copyright
- Foreword
- Preface to the Second Edition
- Preface to the First Edition
- Acknowledgments
- I. Preliminaries
- II. Data Exploration and Dimension Reduction
- 3. Data Visualization
- 3.1. Uses of Data Visualization
- 3.2. Data Examples
- 3.3. Basic Charts: bar charts, line graphs, and scatterplots
- 3.4. Multidimensional Visualization
- 3.4.1. Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
- 3.4.2. Manipulations: Rescaling, Aggregation and Hierarchies, Zooming and Panning, and Filtering
- 3.4.3. Reference: Trend Lines and Labels
- 3.4.4. Scaling up: Large Datasets
- 3.4.5. Multivariate Plot: Parallel Coordinates Plot
- 3.4.6. Interactive Visualization
- 3.5. Specialized Visualizations
- 3.6. Summary of major visualizations and operations, according to data mining goal
- 3.7. PROBLEMS
- 4. Dimension Reduction
- 4.1. Introduction
- 4.2. Practical Considerations
- 4.3. Data Summaries
- 4.4. Correlation Analysis
- 4.5. Reducing the Number of Categories in Categorical Variables
- 4.6. Converting A Categorical Variable to A Numerical Variable
- 4.7. Principal Components Analysis
- 4.8. Dimension Reduction Using Regression Models
- 4.9. Dimension Reduction Using Classification and Regression Trees
- 4.10. PROBLEMS
- III. Performance Evaluation
- 5. Evaluating Classification and Predictive Performance
- 5.1. Introduction
- 5.2. Judging Classification Performance
- 5.2.1. Benchmark: The Naive Rule
- 5.2.2. Class Separation
- 5.2.3. Classification Matrix
- 5.2.4. Using the Validation Data
- 5.2.5. Accuracy Measures
- 5.2.6. Cutoff for Classification
- 5.2.7. Performance in Unequal Importance of Classes
- 5.2.8. Asymmetric Misclassification Costs
- 5.2.9. Oversampling and Asymmetric Costs
- 5.2.10. Classification Using a Triage Strategy
- 5.3. Evaluating Predictive Performance
- 5.4. PROBLEMS
- IV. Prediction and Classification Methods
- 6. Multiple Linear Regression
- 7. k-Nearest Neighbors (k-NN)
- 8. Naive Bayes
- 9. Classification and Regression Trees
- 9.1. Introduction
- 9.2. Classification Trees
- 9.3. Measures of Impurity
- 9.4. Evaluating the Performance of a Classification Tree
- 9.5. Avoiding Overfitting
- 9.6. Classification Rules from Trees
- 9.7. Classification Trees for More Than Two Classes
- 9.8. Regression Trees
- 9.9. Advantages, Weaknesses, and Extensions
- 9.10. PROBLEMS
- 10. Logistic Regression
- 11. Neural Nets
- 12. Discriminant Analysis
- 12.1. Introduction
- 12.2. Distance of an Observation from a Class
- 12.3. Fisher's Linear Classification Functions
- 12.4. Classification Performance of Discriminant Analysis
- 12.5. Prior Probabilities
- 12.6. Unequal Misclassification Costs
- 12.7. Classifying More Than Two Classes
- 12.8. Advantages and Weaknesses
- 12.9. PROBLEMS
- V. Mining Relationships Among Records
- 13. Association Rules
- 14. Cluster Analysis
- 14.1. Introduction
- 14.2. Measuring Distance Between Two Records
- 14.3. Measuring Distance Between Two Clusters
- 14.4. Hierarchical (Agglomerative) Clustering
- 14.4.1. Minimum Distance (Single Linkage)
- 14.4.2. Maximum Distance (Complete Linkage)
- 14.4.3. Average Distance (Average Linkage)
- 14.4.4. Centroid Distance (Average Group Linkage)
- 14.4.5. Ward's Method
- 14.4.6. Dendrograms: Displaying Clustering Process and Results
- 14.4.7. Validating Clusters
- 14.4.8. Limitations of Hierarchical Clustering
- 14.5. Nonhierarchical Clustering: The k-Means Algorithm
- 14.6. PROBLEMS
- VI. Forecasting Time Series
- 15. Handling Time Series
- 16. Regression-Based Forecasting
- 17. Smoothing Methods
- VII. Cases
- References
Product information
- Title: Data Mining For Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel® with XLMiner®, Second Edition
- Author(s):
- Release date: October 2010
- Publisher(s): Wiley
- ISBN: 9780470526828