Chapter 8: Decision Trees

In Chapter 7, we introduced the naïve Bayes classifier as a machine learning approach that uses the probability of prior events to inform the likelihood of a future event. In this chapter, we introduce a different type of classifier known as a decision tree. Instead of using the probability of prior events to predict future events, the decision tree classifier uses a logical tree-like structure to represent the relationship between predictors and a target outcome.

Decision trees are constructed using a divide-and-conquer approach, in which the original dataset is split repeatedly into smaller subsets until each subset is as homogeneous as possible. We discuss this recursive partitioning approach at some length in the early part of the chapter. Later in the chapter, we discuss the process of paring back the size of a decision tree so that it is useful for a wider set of use cases. We wrap up the chapter by training a decision tree model in R, discussing the strengths and weaknesses of the approach, and working through a use case.
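The divide-and-conquer idea can be sketched in a few lines of base R. This is a minimal illustration rather than the implementation used later in the chapter: the function names (`gini`, `best_split`, `grow_tree`, `predict_one`) are hypothetical, and the sketch assumes numeric predictors only, Gini impurity as the measure of homogeneity, and a simple `max_depth` stopping rule.

```r
# Gini impurity of a set of class labels: 1 minus the sum of squared
# class proportions. A perfectly homogeneous set scores 0.
gini <- function(y) {
  if (length(y) == 0) return(0)
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Search every predictor and candidate threshold for the split with the
# lowest weighted impurity across the two resulting subsets.
best_split <- function(X, y) {
  best <- list(score = gini(y), feature = NULL, threshold = NULL)
  for (feature in names(X)) {
    values <- sort(unique(X[[feature]]))
    if (length(values) < 2) next
    # Candidate thresholds are midpoints between adjacent distinct values.
    thresholds <- (head(values, -1) + tail(values, -1)) / 2
    for (threshold in thresholds) {
      left <- X[[feature]] <= threshold
      score <- (sum(left) * gini(y[left]) +
                sum(!left) * gini(y[!left])) / length(y)
      if (score < best$score) {
        best <- list(score = score, feature = feature, threshold = threshold)
      }
    }
  }
  best
}

# Recursively partition the data until a subset is pure, no split
# reduces impurity, or the (assumed) depth limit is reached.
grow_tree <- function(X, y, depth = 0, max_depth = 3) {
  majority <- names(which.max(table(y)))
  split <- best_split(X, y)
  if (gini(y) == 0 || is.null(split$feature) || depth >= max_depth) {
    return(list(leaf = TRUE, prediction = majority, n = length(y)))
  }
  left <- X[[split$feature]] <= split$threshold
  list(
    leaf = FALSE, feature = split$feature, threshold = split$threshold,
    left = grow_tree(X[left, , drop = FALSE], y[left], depth + 1, max_depth),
    right = grow_tree(X[!left, , drop = FALSE], y[!left], depth + 1, max_depth)
  )
}

# Follow the splits from the root down to a leaf for a single row.
predict_one <- function(tree, row) {
  while (!tree$leaf) {
    tree <- if (row[[tree$feature]] <= tree$threshold) tree$left else tree$right
  }
  tree$prediction
}

# Example on the built-in iris data, using two numeric predictors.
X <- iris[, c("Petal.Length", "Petal.Width")]
tree <- grow_tree(X, iris$Species, max_depth = 2)
predict_one(tree, X[1, ])
```

On `iris`, the first split separates all setosa flowers from the rest, so that branch becomes a leaf immediately, while the other branch is partitioned further. The depth limit is a crude stand-in for the pruning discussed later in the chapter.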

By the end of this chapter, you will have learned the following:

The basic components of a decision tree and how to interpret it
How decision trees are constructed based on the process of recursive partitioning and impurity
Two of the most popular implementations of decision trees and how they differ in terms of how they measure impurity
Why and how decision trees are pruned
How to build a decision tree classifier in R and how ...
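
To make the notion of impurity concrete, here are two widely used measures, entropy and Gini impurity. This overview does not say which measures the two implementations rely on, so presenting this pair is an assumption. The symbols S (a subset of the data), K (the number of classes), and p_k (the proportion of S in class k) are introduced here for illustration.

```latex
\begin{align*}
\mathrm{Entropy}(S) &= -\sum_{k=1}^{K} p_k \log_2 p_k \\
\mathrm{Gini}(S)    &= 1 - \sum_{k=1}^{K} p_k^2 \\[6pt]
% Worked example: 10 examples, 8 in one class and 2 in the other
\text{For } p = (0.8,\ 0.2):\quad
\mathrm{Entropy}(S) &= -\left(0.8 \log_2 0.8 + 0.2 \log_2 0.2\right) \approx 0.722 \\
\mathrm{Gini}(S)    &= 1 - \left(0.8^2 + 0.2^2\right) = 0.32
\end{align*}
```

Both measures equal 0 for a perfectly homogeneous subset and are largest when the classes are evenly mixed, which is why a split that lowers either one produces purer subsets.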

Practical Machine Learning in R by Fred Nwanganga, Mike Chapple
