Data Science, 2nd Edition

Book description

Learn the basics of Data Science through an easy-to-understand conceptual framework, then immediately practice using the RapidMiner platform. Whether you are brand new to data science or working on your tenth project, this book will show you how to analyze data and uncover hidden patterns and relationships to aid important decisions and predictions.

Data Science has become an essential tool to extract value from data for any organization that collects, stores, and processes data as part of its operations. This book is ideal for business users, data analysts, business analysts, engineers, analytics professionals, and anyone else who works with data.

You’ll be able to:

  1. Gain the necessary knowledge of different data science techniques to extract value from data.
  2. Master the concepts and inner workings of 30 commonly used powerful data science algorithms.
  3. Implement a step-by-step data science process using RapidMiner, an open source, GUI-based data science platform.

Data Science techniques covered: Exploratory data analysis, Visualization, Decision trees, Rule induction, k-nearest neighbors, Naïve Bayesian classifiers, Artificial neural networks, Deep learning, Support vector machines, Ensemble models, Random forests, Regression, Recommendation engines, Association analysis, K-Means and Density-based clustering, Self-organizing maps, Text mining, Time series forecasting, Anomaly detection, Feature selection and more...

  • Contains fully updated content on data science, including tactics on how to mine business data for information
  • Presents simple explanations for over twenty powerful data science techniques
  • Enables the practical use of data science algorithms without the need for programming
  • Demonstrates processes with practical use cases
  • Introduces each algorithm or technique and explains how it works in plain language
  • Describes the commonly used setup options for the open source tool RapidMiner

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. Foreword
  7. Preface
    1. Why Data Science?
    2. Why This Book?
    3. Who Can Use This Book?
  8. Acknowledgments
  9. Chapter 1. Introduction
    1. Abstract
    2. 1.1 AI, Machine learning, and Data Science
    3. 1.2 What is Data Science?
    4. 1.3 Case for Data Science
    5. 1.4 Data Science Classification
    6. 1.5 Data Science Algorithms
    7. 1.6 Roadmap for This Book
    8. References
  10. Chapter 2. Data Science Process
    1. Abstract
    2. 2.1 Prior Knowledge
    3. 2.2 Data Preparation
    4. 2.3 Modeling
    5. 2.4 Application
    6. 2.5 Knowledge
    7. References
  11. Chapter 3. Data Exploration
    1. Abstract
    2. 3.1 Objectives of Data Exploration
    3. 3.2 Datasets
    4. 3.3 Descriptive Statistics
    5. 3.4 Data Visualization
    6. 3.5 Roadmap for Data Exploration
    7. References
  12. Chapter 4. Classification
    1. Abstract
    2. 4.1 Decision Trees
    3. 4.2 Rule Induction
    4. 4.3 k-Nearest Neighbors
    5. 4.4 Naïve Bayesian
    6. 4.5 Artificial Neural Networks
    7. 4.6 Support Vector Machines
    8. 4.7 Ensemble Learners
    9. References
  13. Chapter 5. Regression Methods
    1. Abstract
    2. 5.1 Linear Regression
    3. 5.2 Logistic Regression
    4. 5.3 Conclusion
    5. References
  14. Chapter 6. Association Analysis
    1. Abstract
    2. 6.1 Mining Association Rules
    3. 6.2 Apriori Algorithm
    4. 6.3 Frequent Pattern-Growth Algorithm
    5. 6.4 Conclusion
    6. References
  15. Chapter 7. Clustering
    1. Abstract
    2. Clustering to Describe the Data
    3. Clustering for Preprocessing
    4. Types of Clustering Techniques
    5. 7.1 k-Means Clustering
    6. 7.2 DBSCAN Clustering
    7. 7.3 Self-Organizing Maps
    8. References
  16. Chapter 8. Model Evaluation
    1. Abstract
    2. 8.1 Confusion Matrix
    3. 8.2 ROC and AUC
    4. 8.3 Lift Curves
    5. 8.4 How to Implement
    6. 8.5 Conclusion
    7. References
  17. Chapter 9. Text Mining
    1. Abstract
    2. 9.1 How It Works
    3. 9.2 How to Implement
    4. 9.3 Conclusion
    5. References
  18. Chapter 10. Deep Learning
    1. Abstract
    2. 10.1 The AI Winter
    2. 10.2 How It Works
    4. 10.3 How to Implement
    5. 10.4 Conclusion
    6. References
  19. Chapter 11. Recommendation Engines
    1. Abstract
    2. Why Do We Need Recommendation Engines?
    3. Applications of Recommendation Engines
    4. 11.1 Recommendation Engine Concepts
    5. 11.2 Collaborative Filtering
    6. 11.3 Content-Based Filtering
    7. 11.4 Hybrid Recommenders
    8. 11.5 Conclusion
    9. References
  20. Chapter 12. Time Series Forecasting
    1. Abstract
    2. Taxonomy of Time Series Forecasting
    3. 12.1 Time Series Decomposition
    4. 12.2 Smoothing Based Methods
    5. 12.3 Regression Based Methods
    6. 12.4 Machine Learning Methods
    7. 12.5 Performance Evaluation
    8. 12.6 Conclusion
    9. References
  21. Chapter 13. Anomaly Detection
    1. Abstract
    2. 13.1 Concepts
    3. 13.2 Distance-Based Outlier Detection
    4. 13.3 Density-Based Outlier Detection
    5. 13.4 Local Outlier Factor
    6. 13.5 Conclusion
    7. References
  22. Chapter 14. Feature Selection
    1. Abstract
    2. 14.1 Classifying Feature Selection Methods
    3. 14.2 Principal Component Analysis
    4. 14.3 Information Theory-Based Filtering
    5. 14.4 Chi-Square-Based Filtering
    6. 14.5 Wrapper-Type Feature Selection
    7. 14.6 Conclusion
    8. References
  23. Chapter 15. Getting Started with RapidMiner
    1. Abstract
    2. 15.1 User Interface and Terminology
    3. 15.2 Data Importing and Exporting Tools
    4. 15.3 Data Visualization Tools
    5. 15.4 Data Transformation Tools
    6. 15.5 Sampling and Missing Value Tools
    7. 15.6 Optimization Tools
    8. 15.7 Integration with R
    9. 15.8 Conclusion
    10. References
  24. Comparison of Data Science Algorithms
  25. About the Authors
  26. Index
  27. Praise

Product information

  • Title: Data Science, 2nd Edition
  • Author(s): Vijay Kotu, Bala Deshpande
  • Release date: November 2018
  • Publisher(s): Morgan Kaufmann
  • ISBN: 9780128147627