Chapter 1

An Introduction to Data Classification

Charu C. Aggarwal

IBM T. J. Watson Research Center
Yorktown Heights, NY
charu@us.ibm.com

1.1 Introduction

The problem of data classification arises in a wide variety of data mining applications. This is because the problem attempts to learn the relationship between a set of feature variables and a target variable of interest. Since many practical problems can be expressed as associations between feature and target variables, the model is broadly applicable. The problem of classification may be stated as follows:

Given a set of training data points along with associated training labels, determine the class label for an unlabeled test instance.
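
To make the problem statement concrete, the following minimal sketch labels a test instance with the label of its closest training point (a one-nearest-neighbor rule). This rule is only one of many possible classifiers and is chosen here purely for illustration; the function name predict_1nn, the two-dimensional points, and the labels "A" and "B" are hypothetical and do not come from the text.

```python
import math


def predict_1nn(train_points, train_labels, test_point):
    """Return the label of the training point nearest to test_point.

    Illustrative assumption: features are numeric tuples compared by
    Euclidean distance. The problem statement itself does not fix a
    feature type or a distance measure.
    """
    if not train_points or len(train_points) != len(train_labels):
        raise ValueError("need one label per training point, and at least one point")
    # Index of the training point with the smallest distance to the test point.
    nearest = min(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], test_point),
    )
    return train_labels[nearest]


# Hypothetical training data: feature vectors and their class labels.
train_points = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0)]
train_labels = ["A", "A", "B", "B"]

# Unlabeled test instance whose class label is to be determined.
test_point = (3.9, 4.1)
predicted = predict_1nn(train_points, train_labels, test_point)
print(predicted)
```

Here the training points and labels play the role of the training data, and the call to predict_1nn plays the role of determining the class label for the unlabeled test instance.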

Numerous variations ...
