306 ◾ Kweku-Muata Osei-Bryson
Abstract: The decision tree (DT) induction process has two major phases: the growth
phase and the pruning phase. The pruning phase aims to generalize the DT that was
generated in the growth phase by generating a sub-tree that avoids overfitting to the
training data. Most post-pruning methods essentially treat post-pruning as a
single-objective problem (i.e., maximize validation accuracy) and address the
issue of simplicity (in terms of the number of leaves) only in the case of a tie. However,
it is well known that, apart from accuracy, there are other performance measures (e.g.,
stability, simplicity, and interpretability) that are important for evaluating DT
quality. In this chapter, we propose that multiobjective evaluation be done during the
post-pruning phase in order to select the best sub-tree, and we propose a procedure for
obtaining the optimal sub-tree based on user-provided preferences and values.
Keywords: Decision Tree, post-pruning, performance measures, multi-criteria
decision analysis, mixed integer programming, analytic hierarchy process
Introduction
Decision tree (DT) induction is one of the most popular data mining (DM)
techniques (e.g., Witten and Frank 2000), with applications in various fields. A DT is
a tree-structure representation of the given decision problem such that each nonleaf
node is associated with one of the decision variables, each branch from a nonleaf node
is associated with a subset of the values of the corresponding decision variable, and
each leaf node is associated with a value of the target (or dependent) variable. There
are two main types of DTs: (1) classification trees and (2) regression trees (RTs). For
a classification tree, the target variable takes its values from a discrete domain, and for
each leaf node, the DT associates a probability for each class (i.e., value of the target
variable). The class that is assigned to a given leaf node of the classification tree results
from a form of majority voting in which the winning class is the one with the
largest class probability. On the other hand, the target variable of an RT takes its
values from a continuous (numeric) domain, and with each leaf, the RT associates the
mean value of the target variable. In this chapter, we will focus on the classification
tree, which is the most commonly used type of DT; henceforth, in this chapter,
whenever we use the term "decision tree" we will be referring to a classification tree.
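The two leaf-node prediction rules just described can be sketched in a few lines of Python. This is an illustrative snippet, not part of the chapter; the function names are ours, and the inputs are simply the target values of the training records that reach a given leaf.

```python
from collections import Counter
from statistics import mean

def classification_leaf_label(leaf_targets):
    """Majority voting at a classification-tree leaf: each class's
    probability is its relative frequency among the training records
    reaching the leaf, and the class with the largest probability wins."""
    counts = Counter(leaf_targets)
    label, count = counts.most_common(1)[0]
    return label, count / len(leaf_targets)

def regression_leaf_value(leaf_targets):
    """A regression-tree leaf predicts the mean of the numeric
    target values of the training records reaching it."""
    return mean(leaf_targets)

# Example: a leaf reached by three records with classes yes, yes, no
# is labeled "yes" with class probability 2/3; a regression leaf
# reached by targets 1.0, 2.0, 3.0 predicts their mean, 2.0.
label, prob = classification_leaf_label(["yes", "yes", "no"])
value = regression_leaf_value([1.0, 2.0, 3.0])
```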
There are two major phases of the DT induction process: the growth phase and
the pruning phase (e.g., Kim and Koehler 1995). The growth phase involves a
recursive partitioning of the training data resulting in a DT such that either each leaf
node is associated with a single class or further partitioning of the given leaf would