CHAPTER 24 Trees

In Chapter 23, we introduced decision stumps and their use within the AdaBoost algorithm. A decision stump is the simplest example of a decision tree. In this chapter, we cover decision trees in more detail; see Morgan and Sonquist (1963) for an example of early work in this direction, or Breiman et al. (1984) and Loh (2011). We present methods for both classification and regression. We start with a theoretical introduction and proceed with an implementation in q. The functions we use to represent trees are based on the concept of treetables, first introduced by Stevan Apter; see Apter (2010).

24.1 INTRODUCTION TO TREES

Let us consider a feature space spanned by a set of features. Tree-based methods partition this space into rectangular domains and then assign a simple model to each domain. The simple model is usually a constant fitted to the observations that belong to the given domain. This corresponds to the CART (Classification And Regression Tree) approach.
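To make the idea of "rectangular domains with a constant model" concrete, the following is a minimal sketch in Python (not q, and not the treetable implementation developed in this chapter). The toy data, the region labels `R1` to `R3`, and the hand-picked split points `s1` and `s2` are all assumptions made for illustration; an actual tree would choose its splits by minimising a loss rather than fixing them in advance.

```python
from statistics import mean

# Hypothetical toy data: two features (x1, x2) and a numeric response y.
X = [(0.1, 0.2), (0.3, 0.8), (0.4, 0.1), (0.7, 0.3), (0.9, 0.9), (0.8, 0.6)]
y = [1.0, 3.0, 1.2, 5.0, 7.0, 6.8]


def region(x, s1=0.5, s2=0.5):
    """Return the label of the rectangle that contains the point x.

    The split points s1 and s2 are fixed by hand (an assumption);
    they are not learned from the data in this sketch.
    """
    x1, x2 = x
    if x1 < s1:
        # Left half of the space, split again along the second feature.
        return "R1" if x2 < s2 else "R2"
    # Right half of the space is kept as a single rectangle.
    return "R3"


def fit(X, y):
    """Fit a constant model (the mean response) within each region."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(region(xi), []).append(yi)
    return {label: mean(values) for label, values in groups.items()}


def predict(model, x):
    """Predict by looking up the constant assigned to the region of x."""
    return model[region(x)]


model = fit(X, y)
```

Here the regression case is shown, with the mean response as the constant in each rectangle. For classification, a natural analogue would be to assign the most frequent class among the observations in each rectangle.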

Figure 24.1 illustrates partitioning for a case of images, where images for images. We have ...
