8 Classification
Classification proceeds in two phases. First, classifiers (mechanisms,
such as classification rules, that assign the appropriate class to each
data item) are learned from data whose classes or categories are known in
advance. Second, when new data are given, they are classified by using the
learned classifiers. This chapter describes how to construct such
classifiers.
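The two phases can be illustrated with a minimal sketch. It assumes a single numeric attribute, two classes named "yes" and "no", and a classifier that is just one threshold; the function names `learn_threshold` and `classify` and the training values are hypothetical and chosen only for illustration. It is not the construction method this chapter develops.

```python
from typing import List, Tuple

def learn_threshold(samples: List[Tuple[float, str]]) -> float:
    """Learning phase: choose the threshold that misclassifies the fewest
    labeled samples under the rule 'value >= threshold -> "yes"'."""
    values = sorted({value for value, _ in samples})
    # Candidate thresholds: every observed value, plus one above the maximum
    # (which corresponds to always answering "no").
    candidates = values + [values[-1] + 1.0]

    def errors(threshold: float) -> int:
        # Count samples whose predicted class differs from the known class.
        return sum((value >= threshold) != (label == "yes")
                   for value, label in samples)

    return min(candidates, key=errors)

def classify(value: float, threshold: float) -> str:
    """Classification phase: apply the learned rule to a new data item."""
    return "yes" if value >= threshold else "no"

# Hypothetical labeled data: (attribute value, known class).
training = [(1.0, "no"), (2.0, "no"), (3.0, "no"),
            (4.0, "yes"), (5.0, "yes"), (6.0, "yes")]

threshold = learn_threshold(training)   # learned from data with known classes
print(classify(4.5, threshold))         # class assigned to a new data item
```

The learned threshold plays the role of the classifier here: it is built once from the labeled samples and then reused for every new item.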
8.1 Motivation
When a customer applies for a credit card, deciding whether to issue the
card is an important problem for the credit card company. This business is
called credit business. From data on past customers, credit card companies
have learned decision rules for whether to issue a card to a new customer,
in other words, which conditions such a customer should satisfy. Learning
decision rules from past data samples and using them to decide ‘yes’ or
‘no’ for new data, that is, to assign appropriate classes to new data, is
classification [Mitchell 1997, Witten et al. 1999, Han et al. 2001, Hand
et al. 2001]. The rules used to make such a decision are called
classification rules. Classification therefore presupposes that the classes
(or categories) to which data may belong are known in advance. As already
described, the clustering task, which partitions data into groups according
to their degree of similarity, differs significantly from the classification
task because in clustering the characteristics and the names of the groups
may be unknown in advance.
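As a rough illustration of what a classification rule for the credit example might look like, the sketch below represents a rule as a list of conditions that an applicant must all satisfy to receive the class ‘yes’. The attribute names (`annual_income`, `years_employed`) and the threshold values are invented for this example; they do not come from any real credit policy or from the cited literature.

```python
from typing import Callable, Dict, List

Applicant = Dict[str, float]
Condition = Callable[[Applicant], bool]

# Hypothetical classification rule: every condition must hold for "yes".
ISSUE_CARD_RULE: List[Condition] = [
    lambda a: a["annual_income"] >= 30000,   # assumed income condition
    lambda a: a["years_employed"] >= 2,      # assumed employment condition
]

def decide(applicant: Applicant, rule: List[Condition]) -> str:
    """Assign the class 'yes' only if the applicant meets all conditions."""
    return "yes" if all(condition(applicant) for condition in rule) else "no"

# Two hypothetical new applicants.
print(decide({"annual_income": 45000, "years_employed": 5}, ISSUE_CARD_RULE))
print(decide({"annual_income": 45000, "years_employed": 1}, ISSUE_CARD_RULE))
```

In practice the conditions would be learned from past customer data rather than written by hand; the hand-written rule only shows the form that a learned rule can take.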
Classification in which the result is a continuous numeric value instead
of a discrete value (i.e., a class) is called prediction or regression.
Prediction will be described in a separate chapter.
