Book description
Written by leaders in the data mining community, including the developers of the RapidMiner software, this book provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors. It presents the most powerful and flexible open-source software solutions: RapidMiner and RapidAnalytics. The book and software tools cover all relevant steps of the data mining process. The software and its extensions can be freely downloaded at www.RapidMiner.com.
Table of contents
- Preliminaries
- Series
- Dedication
- Foreword
- Preface
- What Is Data Mining? What Is It Good for, What Are Its Applications, and What Does It Enable Me to Do?
- Why Should I Read This Book? Why Case Studies? What Will I Learn? What Will I Be Able to Achieve?
- What Are the Advantages of the Open Source Solutions RapidMiner and RapidAnalytics Used in This Book?
- What Is the Structure of This Book and Which Chapters Should I Read?
- About the Editors
- List of Contributors
- Acknowledgments
- Part I Introduction to Data Mining and RapidMiner
- Part II Basic Classification Use Cases for Credit Approval and in Education
- Chapter 3 k-Nearest Neighbor Classification I
- Chapter 4 k-Nearest Neighbor Classification II
- Chapter 5 Naïve Bayes Classification I
- Chapter 6 Naïve Bayes Classification II
- Part III Marketing, Cross-Selling, and Recommender System Use Cases
- Chapter 7 Who Wants My Product? Affinity-Based Marketing
- Chapter 8 Basic Association Rule Mining in RapidMiner
- 8.1 Data Mining Case Study
- Figure 8.1
- Figure 8.2
- Figure 8.3
- Figure 8.4
- Figure 8.5
- Figure 8.6
- Figure 8.7
- Figure 8.8
- Figure 8.9
- Figure 8.10
- Figure 8.11
- Figure 8.12
- Figure 8.13
- Figure 8.14
- Figure 8.15
- Figure 8.16
- Figure 8.17
- Figure 8.18
- Figure 8.19
- Figure 8.20
- Figure 8.21
- Figure 8.22
- Figure 8.23
- Figure 8.24
- Figure 8.25
- Figure 8.26
- Figure 8.27
- Figure 8.28
- Figure 8.29
- Figure 8.30
- Figure 8.31
- Figure 8.32
- Figure 8.33
- Figure 8.34
- Figure 8.35
- Chapter 9 Constructing Recommender Systems in RapidMiner
- Acronyms
- 9.1 Introduction
- 9.2 The Recommender Extension
- 9.3 The VideoLectures.net Dataset
- 9.4 Collaborative-based Systems
- 9.5 Content-based Recommendation
- 9.6 Hybrid Recommender Systems
- 9.7 Providing RapidMiner Recommender System Workflows as Web Services Using RapidAnalytics
- 9.8 Summary
- Glossary
- Bibliography
- Chapter 10 Recommender System for Selection of the Right Study Program for Higher Education Students
- Part IV Clustering in Medical and Educational Domains
- Chapter 11 Visualising Clustering Validity Measures
- Chapter 12 Grouping Higher Education Students with RapidMiner
- Part V Text Mining: Spam Detection, Language Detection, and Customer Feedback Analysis
- Chapter 13 Detecting Text Message Spam
- Acronyms
- 13.1 Overview
- 13.2 Applying This Technique in Other Domains
- 13.3 Installing the Text Processing Extension
- 13.4 Getting the Data
- 13.5 Loading the Text
- 13.6 Examining the Text
- 13.7 Processing the Text for Classification
- 13.8 The Naïve Bayes Algorithm
- 13.9 Classifying the Data as Spam or Ham
- 13.10 Validating the Model
- 13.11 Applying the Model to New Data
- 13.12 Improvements
- 13.13 Summary
- Chapter 14 Robust Language Identification with RapidMiner: A Text Mining Use Case
- Chapter 15 Text Mining with RapidMiner
- Part VI Feature Selection and Classification in Astroparticle Physics and in Medical Domains
- Chapter 16 Application of RapidMiner in Neutrino Astronomy
- Chapter 17 Medical Data Mining
- 17.1 Background
- 17.2 Description of Problem Domain: Two Medical Examples
- 17.3 Data Mining Algorithms in Medicine
- 17.4 Knowledge Discovery Process in RapidMiner: Carpal Tunnel Syndrome
- 17.4.1 Defining the Problem, Setting the Goals
- 17.4.2 Dataset Representation
- 17.4.3 Data Preparation
- 17.4.4 Modeling
- 17.4.5 Selecting Appropriate Methods for Classification
- 17.4.6 Results and Data Visualisation
- 17.4.7 Interpretation of the Results
- 17.4.8 Hypothesis Testing and Statistical Analysis
- 17.4.9 Results and Visualisation
- 17.5 Knowledge Discovery Process in RapidMiner: Diabetes
- 17.6 Specifics in Medical Data Mining
- 17.7 Summary
- Bibliography
- Figure 17.1
- Figure 17.2
- Figure 17.3
- Figure 17.4
- Figure 17.5
- Figure 17.6
- Figure 17.7
- Figure 17.8
- Figure 17.9
- Figure 17.10
- Figure 17.11
- Figure 17.12
- Figure 17.13
- Figure 17.14
- Figure 17.15
- Figure 17.16
- Figure 17.17
- Figure 17.18
- Figure 17.19
- Figure 17.20
- Figure 17.21
- Figure 17.22
- Figure 17.23
- Figure 17.24
- Part VII Molecular Structure- and Property-Activity Relationship Modeling in Biochemistry and Medicine
- Chapter 18 Using PaDEL to Calculate Molecular Properties and Chemoinformatic Models
- 18.1 Introduction
- 18.2 Molecular Structure Formats for Chemoinformatics
- 18.3 Installation of the PaDEL Extension for RapidMiner
- 18.4 Applications and Capabilities of the PaDEL Extension
- 18.5 Examples of Computer-aided Predictions
- 18.6 Calculation of Molecular Properties
- 18.7 Generation of a Linear Regression Model
- 18.8 Example Workflow
- 18.9 Summary
- Acknowledgment
- Bibliography
- Chapter 19 Chemoinformatics: Structure- and Property-activity Relationship Development
- Part VIII Image Mining: Feature Extraction, Segmentation, and Classification
- Chapter 20 Image Mining Extension for RapidMiner (Introductory)
- Chapter 21 Image Mining Extension for RapidMiner (Advanced)
- Part IX Anomaly Detection, Instance Selection, and Prototype Construction
- Chapter 22 Instance Selection in RapidMiner
- Chapter 23 Anomaly Detection
- Acronyms
- 23.1 Introduction
- 23.2 Categorizing an Anomaly Detection Problem
- 23.3 A Simple Artificial Unsupervised Anomaly Detection Example
- 23.4 Unsupervised Anomaly Detection Algorithms
- 23.4.1 k-NN Global Anomaly Score
- 23.4.2 Local Outlier Factor (LOF)
- 23.4.3 Connectivity-Based Outlier Factor (COF)
- 23.4.4 Influenced Outlierness (INFLO)
- 23.4.5 Local Outlier Probability (LoOP)
- 23.4.6 Local Correlation Integral (LOCI) and aLOCI
- 23.4.7 Cluster-Based Local Outlier Factor (CBLOF)
- 23.4.8 Local Density Cluster-Based Outlier Factor (LDCOF)
- 23.5 An Advanced Unsupervised Anomaly Detection Example
- 23.6 Semi-supervised Anomaly Detection
- 23.7 Summary
- Glossary
- Bibliography
- Part X Meta-Learning, Automated Learner Selection, Feature Selection, and Parameter Optimization
- Chapter 24 Using RapidMiner for Research: Experimental Evaluation of Learners
- 24.1 Introduction
- 24.2 Research of Learning Algorithms
- 24.3 Experimental Evaluation in RapidMiner
- 24.3.1 Setting Up the Evaluation Scheme
- 24.3.2 Looping Through a Collection of Datasets
- 24.3.3 Looping Through a Collection of Learning Algorithms
- 24.3.4 Logging and Visualizing the Results
- 24.3.5 Statistical Analysis of the Results
- 24.3.6 Exception Handling and Parallelization
- 24.3.7 Setup for Meta-Learning
- 24.4 Conclusions
- Bibliography
Product information
- Title: RapidMiner
- Author(s):
- Release date: April 2016
- Publisher(s): Chapman and Hall/CRC
- ISBN: 9781498759861