9 Unsupervised methods

This chapter covers

  • Using R’s clustering functions to explore data and look for similarities
  • Choosing the right number of clusters
  • Evaluating a cluster
  • Using R’s association rules functions to find patterns of co-occurrence in data
  • Evaluating a set of association rules

In the previous chapter, we covered using the vtreat package to prepare messy real-world data for modeling. In this chapter, we’ll look at methods to discover unknown relationships in data. These methods are called unsupervised methods. With unsupervised methods, there’s no outcome that you’re trying to predict; instead, you want to discover patterns in the data that perhaps you hadn’t previously suspected. For example, you may want to find groups of customers ...
