Chapter 17. Classification

This chapter covers

  • Classifying with decision trees
  • Ensemble classification with random forests
  • Creating a support vector machine
  • Evaluating classification accuracy

Data analysts often need to predict a categorical outcome from a set of predictor variables. Some examples include

  • Predicting whether an individual will repay a loan, given their demographics and financial history
  • Determining whether an ER patient is having a heart attack, based on their symptoms and vital signs
  • Deciding whether an email is spam, given the presence of keywords, images, hypertext, header information, and origin

Each of these cases involves the prediction of a binary categorical outcome (good credit ...
