Chapter 9. Classification with k-Nearest Neighbors and Naïve Bayes

In Chapter 8, Probability Distributions, Covariance, and Correlation, we examined statistical distributions, covariance, and correlation. In the previous chapter, you learned about regression. Here, we focus on classification using Naïve Bayes and k-Nearest Neighbors (k-NN). Both algorithms address the same problem, which can be stated as follows:

  • We have data in which the values of the class (the attribute we want to predict) are known. We call this training data.
  • We have data in which the class values are not known, or in which we pretend not to know them so that we can test whether our classifier works. In the latter case, we call this testing data.
  • We want to predict unknown class values using information from data where ...
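The setup described in the list above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the chapter's own example: the built-in iris data set stands in for real data, the 70/30 split ratio and k = 3 are arbitrary choices, and knn() from the class package is just one of several available k-NN implementations.

```r
# Illustrative sketch: iris, the 70/30 split, and k = 3 are assumptions
# chosen for this example, not values taken from the chapter.
library(class)                                 # provides knn()

set.seed(42)                                   # make the random split reproducible
n <- nrow(iris)                                # 150 rows; the class attribute is Species
train_idx <- sample(n, size = round(0.7 * n))  # row indices used for training

training <- iris[train_idx, ]                  # class values are known
testing  <- iris[-train_idx, ]                 # class values are held back

# The classifier only sees the features of the testing data;
# the true classes are set aside so that we can check the predictions.
test_features <- testing[, names(testing) != "Species"]
test_truth    <- testing$Species

# Predict the "unknown" class values from the training data.
predicted <- knn(train = training[, names(training) != "Species"],
                 test  = test_features,
                 cl    = training$Species,
                 k     = 3)

# Share of testing rows whose predicted class matches the held-back class.
accuracy <- mean(predicted == test_truth)
print(accuracy)
```

Because the true classes of the testing rows are known but hidden from the classifier, comparing predicted with test_truth gives an estimate of how well the classifier would do on data whose classes are genuinely unknown.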
