Chapter 10: Improving Performance

In Chapter 9, we introduced several commonly used approaches to evaluating and estimating the future performance of a machine learning model. As part of that discussion, we explained the ideas behind cross-validation and bootstrapping, which are two of the most popular resampling techniques. We also discussed the limitations of predictive accuracy as the sole measure of model performance and introduced other measures of performance such as kappa, precision, recall, F-measure, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and area under the curve (AUC).

In the previous chapter, to illustrate how model performance evaluation works in R, we used a powerful package called caret. In this chapter, we will continue to rely on some of the functions provided by this package as we look into different techniques for improving the performance of a machine learning model. The techniques we discuss will be based on two main approaches. The first approach is focused on improving performance by optimizing a single model, while the second approach is focused on leveraging the power of several suboptimal models to improve performance.

By the end of this chapter, you will have learned the following:

  • How to improve performance by tuning the parameters of a single machine learning model
  • How to improve performance by bringing several weak machine learning models together to create a more powerful unit

