1 Introduction to Data Mining

Santosh R. Durugkar1, Rohit Raja2, Kapil Kumar Nagwanshi3* and Sandeep Kumar4

1Amity University Rajasthan, Jaipur, India

2IT Department, GGV Bilaspur Central University, Bilaspur, India

3ASET, Amity University Rajasthan, Jaipur, India

4Computer Science and Engineering Department, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India

Abstract

Data mining, as the word “mining” suggests, is the extraction of desired, meaningful information from datasets. Its methods and algorithms help researchers and students develop applications for end users. In healthcare, marketing, scientific applications, and other domains, it enables end users to extract the meaningful information they need from a data collection. This chapter gives a brief introduction to data mining. In the initial section, we discuss knowledge discovery in databases (KDD) and its phases: data cleaning, data integration, data selection and transformation, and representation. A comparative discussion of classification and clustering helps the reader distinguish between these techniques. We also discuss the applications and algorithms of data mining. An introduction to basic clustering algorithms (K-means clustering, hierarchical clustering, fuzzy clustering, and density-based clustering) will help the reader select a specific algorithm for a given application. In the last section of this chapter, we introduce ...
