First, in classification, classifiers are learned on the basis of data whose
classes or categories are known in advance. Classifiers are mechanisms, such
as classification rules, for correctly assigning appropriate classes to data.
Second, when new data are given, they are classified by using the learned
classifiers. This chapter will describe the method for construction of such
When a customer applies for a credit card, whether or not to issue a card
to that customer is an important problem for the credit card company. This
business is called the credit business. From data on past customers, credit
card companies have learned decision rules for whether to issue a card to a
new customer, in other words, which conditions such a customer should
satisfy. Learning decision rules from past data samples and then using them
to decide ‘yes’ or ‘no’ for new data, or more generally to assign appropriate
classes to new data, is classification [Mitchell 1997, Witten et al. 1999,
Han et al. 2001, Hand et al. 2001]. In particular, the rules used to make
such decisions are called classification rules. It is therefore a
prerequisite for classification that the classes (or categories) to which
data may belong are known in advance.
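The two steps, learning a classification rule from past data and then applying it to a new applicant, can be illustrated with a minimal sketch. It assumes a very simple learner that picks the single attribute whose values best predict the class on the past data; this is only one of many possible ways to learn classification rules and is not necessarily the method developed in this chapter. The attribute names (`employed`, `income`), the function names, and all sample values are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical past applications: (attributes, class).
# All values are invented for illustration only.
PAST = [
    ({"employed": "yes", "income": "high"}, "yes"),
    ({"employed": "yes", "income": "low"}, "yes"),
    ({"employed": "no", "income": "high"}, "no"),
    ({"employed": "no", "income": "low"}, "no"),
    ({"employed": "yes", "income": "high"}, "yes"),
    ({"employed": "no", "income": "high"}, "yes"),
]


def learn_rule(samples):
    """Step 1: learn a one-attribute classification rule from labeled data."""
    # Fallback class for attribute values never seen during learning.
    default = Counter(label for _, label in samples).most_common(1)[0][0]
    best = None
    for attr in sorted(samples[0][0]):
        # Count the classes observed for each value of this attribute.
        counts = defaultdict(Counter)
        for attrs, label in samples:
            counts[attrs[attr]][label] += 1
        # Map each attribute value to its most frequent class.
        mapping = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        correct = sum(mapping[attrs[attr]] == label for attrs, label in samples)
        # Keep the attribute that classifies the most past samples correctly.
        if best is None or correct > best[0]:
            best = (correct, attr, mapping)
    _, attr, mapping = best
    return attr, mapping, default


def classify(rule, attrs):
    """Step 2: assign a class to new data using the learned rule."""
    attr, mapping, default = rule
    return mapping.get(attrs[attr], default)


rule = learn_rule(PAST)
decision = classify(rule, {"employed": "yes", "income": "low"})
print(rule[0], decision)
```

On this toy data the learner selects `employed` as the deciding attribute, so the learned rule reads "issue a card if the applicant is employed". Note that learning is possible only because every past sample already carries its class.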
As already described, the clustering task, which partitions data into groups
according to their degree of similarity, is significantly different from the
classification task because in clustering the characteristics and the names
of the groups may be unknown in advance.
Classification in which the result is a continuous numeric value instead of
a discrete value (i.e., a class) is specifically called prediction or
regression. Prediction will be described in a separate chapter.
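The difference between a discrete and a continuous result can be seen in a small sketch. It assumes a k-nearest-neighbour lookup, chosen here only because the same neighbours can yield either kind of output; it is not presented as the method of this book. The income figures, the credit limits, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical samples: (annual income in thousands, card issued?, credit limit).
# All values are invented for illustration only.
SAMPLES = [
    (20.0, "no", 0.0),
    (35.0, "no", 0.0),
    (50.0, "yes", 1000.0),
    (65.0, "yes", 2000.0),
    (80.0, "yes", 3000.0),
]


def nearest(income, k=3):
    """Return the k samples whose income is closest to the given income."""
    return sorted(SAMPLES, key=lambda s: abs(s[0] - income))[:k]


def classify(income, k=3):
    """Classification: the result is a discrete class (majority vote)."""
    votes = Counter(label for _, label, _ in nearest(income, k))
    return votes.most_common(1)[0][0]


def predict(income, k=3):
    """Prediction (regression): the result is a continuous value (mean)."""
    neighbours = nearest(income, k)
    return sum(limit for _, _, limit in neighbours) / len(neighbours)


print(classify(60.0), predict(60.0))
```

For an income of 60, the three nearest samples are those with incomes 65, 50, and 80, so `classify` returns the class `yes` while `predict` returns the numeric value 2000.0.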