## 7.4 CLASSIFICATION AND REGRESSION TREES

### 7.4.1 Overview

In Chapter 6, decision trees were described as a way of grouping observations based on specific values or ranges of descriptor variables. For example, the tree in Figure 7.19 organizes a set of observations based on the number of cylinders (**Cylinders**) of the car. The tree was constructed using **MPG** (miles per gallon) as the response variable, which guided how the tree was built and resulted in groupings that characterize car fuel efficiency. The terminal nodes of the tree (A, B, and C) show a partitioning of cars into sets with good (node A), moderate (node B), and poor (node C) fuel efficiencies.

**Figure 7.19**. Decision tree classifying cars
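A tree like the one in Figure 7.19 can be fit directly from data. The following is a minimal sketch using scikit-learn's `DecisionTreeRegressor` on a small hypothetical data set (the car values here are invented for illustration, not taken from the text); each leaf of the fitted tree predicts the mean **MPG** of the training cars it contains.

```python
# A minimal sketch (hypothetical data; scikit-learn assumed available)
# of fitting a tree with MPG as the response, as in Figure 7.19.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical cars: number of cylinders and observed MPG.
cylinders = np.array([[4], [4], [6], [6], [8], [8]])
mpg = np.array([34.0, 32.0, 24.0, 22.0, 17.0, 15.0])

# Depth 2 is enough to separate the three cylinder groups.
tree = DecisionTreeRegressor(max_depth=2, random_state=0)
tree.fit(cylinders, mpg)

# Each prediction is the mean MPG of the matching leaf's training cars.
print(tree.predict([[4], [6], [8]]))  # [33. 23. 16.]
```

The fitted tree first splits at a threshold between four and six cylinders, then again between six and eight, recovering the same three-way grouping the figure describes.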

Each terminal node is a mutually exclusive set of observations, that is, there is no overlap among nodes A, B, and C. The criteria for inclusion in each of these nodes are defined by the set of branch points used to partition the data. For example, terminal node B is defined as observations where **Cylinders** is greater than or equal to five and less than seven.
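These branch-point rules can be checked directly. The sketch below (hypothetical car names and values, invented for illustration) routes each observation to exactly one terminal node using the **Cylinders** thresholds described above, demonstrating that the three node sets are mutually exclusive and exhaustive.

```python
# A minimal sketch (hypothetical data) of the mutually exclusive
# partition defined by the Cylinders branch points in Figure 7.19.
cars = [
    {"name": "Civic",   "cylinders": 4},
    {"name": "Accord",  "cylinders": 6},
    {"name": "Mustang", "cylinders": 8},
]

def terminal_node(cylinders):
    """Route an observation to one terminal node via the branch points."""
    if cylinders < 5:
        return "A"   # good fuel efficiency: Cylinders < 5
    elif cylinders < 7:
        return "B"   # moderate: 5 <= Cylinders < 7
    else:
        return "C"   # poor fuel efficiency: Cylinders >= 7

nodes = {"A": [], "B": [], "C": []}
for car in cars:
    nodes[terminal_node(car["cylinders"])].append(car["name"])

print(nodes)  # {'A': ['Civic'], 'B': ['Accord'], 'C': ['Mustang']}
```

Because the `if`/`elif`/`else` chain assigns each observation to exactly one branch, no car can fall into two nodes and none is left out, which is precisely the mutual-exclusivity property of the terminal nodes.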

Decision trees can be used as both classification and regression prediction models. Decision trees built to predict a continuous response variable are called *regression trees*, and decision trees built to predict a categorical response are called *classification trees*. During the learning ...