Chapter 9. Classification with k-Nearest Neighbors and Naïve Bayes

In Chapter 8, Probability Distributions, Covariance, and Correlation, we examined statistical distributions, covariance, and correlation. In the previous chapter, you learned about regression. Here, we focus on classification using Naïve Bayes and k-Nearest Neighbors (k-NN). Both algorithms address the same problem:

  • We have data in which the values of the class (the attribute we want to predict) are known. We call this the training data.
  • We have data in which the class values are not known (or we pretend not to know them in order to test that our classifier works, in which case we call this the testing data).
  • We want to predict unknown class values using information from data where ...
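The setup described in the list above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy two-feature data, the labels "a" and "b", the helper name knn_predict, and the choice of k = 3 do not come from the chapter. The testing data carries labels that we pretend not to know while predicting, then use afterwards to check the classifier.

```python
from collections import Counter
import math

# Hypothetical toy training data: (feature vector, known class label).
training_data = [
    ((1.0, 1.1), "a"),
    ((1.2, 0.9), "a"),
    ((0.8, 1.0), "a"),
    ((3.0, 3.2), "b"),
    ((3.1, 2.9), "b"),
    ((2.9, 3.0), "b"),
]

# Hypothetical testing data: the labels exist, but the classifier never sees them.
testing_data = [
    ((1.1, 1.0), "a"),
    ((3.0, 3.0), "b"),
]


def knn_predict(train, point, k=3):
    """Return the majority class among the k training rows nearest to `point`."""
    # Sort training rows by Euclidean distance to the query point, keep the k closest.
    nearest = sorted(train, key=lambda row: math.dist(row[0], point))[:k]
    # Count the class labels of those neighbors and pick the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Predict using only the features of each testing row.
predictions = [knn_predict(training_data, features) for features, _ in testing_data]

# Compare predictions with the withheld labels to estimate accuracy.
correct = sum(pred == actual for pred, (_, actual) in zip(predictions, testing_data))
accuracy = correct / len(testing_data)

print(predictions, accuracy)
```

The same training/testing split applies to Naïve Bayes; only the rule that turns training data into a prediction would differ.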
