Chapter 12
Tell Me Something New: Pattern Discovery and Data Mining
Earlier chapters focus on directed data mining techniques. These techniques are particularly good when the purpose of data mining is to answer a specific question, and the historical data contains examples to help find the best answer. In directed data mining, the answer to the question is “phrased” as a target variable, and the various techniques discover patterns to discern the value of the target.
This chapter, along with the next four, moves on to what is perhaps the more challenging side of data mining. Undirected and semi-directed data mining are most applicable when you do not know the question, or when the question does not have a simple answer residing in the data. These situations may occur in several different ways:
- The goal may be to discover something new, such as groups of customers that are similar to each other or products that sell together.
- The goal may be ill-defined, so defining the target is more than half the battle. One example is determining which credit card customers are revolvers (who keep a large balance), transactors (who use the card and pay the balance every month), or convenience users (who charge up for furniture or a vacation and then pay off the balance over time).
- The goal may be well-defined but historical examples may not help. Fraud detection is one example. Another example is customer-centric forecasting, discussed in Chapter 10 on survival analysis.
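To make the credit card example concrete, here is a minimal sketch of what "defining the target" might look like in code. Everything in it is an assumption made for illustration: the `Statement` record, the `label_customer` function, and especially the 75 percent cutoff are hypothetical and not taken from any real scoring system. The sketch is meant to show that the labels depend on choices the analyst has to make, which is why defining the target can be the hard part.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Statement:
    """One monthly statement (hypothetical record layout)."""
    balance: float  # amount due on the statement
    payment: float  # amount the customer paid against it


# Assumed cutoff: a customer who carries a balance in at least this
# share of months is called a revolver. The value is arbitrary.
REVOLVER_SHARE = 0.75


def label_customer(history: List[Statement]) -> str:
    """Assign an illustrative behavior label from monthly statements."""
    if not history:
        return "unknown"
    # Balance left unpaid after each month's payment.
    carried = [max(s.balance - s.payment, 0.0) for s in history]
    share_carrying = sum(1 for c in carried if c > 0) / len(history)
    if share_carrying == 0:
        return "transactor"        # pays in full every month
    if share_carrying >= REVOLVER_SHARE:
        return "revolver"          # carries a balance most months
    return "convenience user"      # carries a balance for a while, then clears it


transactor = [Statement(500, 500) for _ in range(6)]
revolver = [Statement(3000, 100) for _ in range(6)]
convenience = [
    Statement(200, 200), Statement(2000, 700), Statement(1300, 700),
    Statement(600, 600), Statement(150, 150), Statement(180, 180),
]

for name, history in [("A", transactor), ("B", revolver), ("C", convenience)]:
    print(name, label_customer(history))
```

Moving the cutoff, or requiring a minimum balance before a month counts as "carrying," would shift customers between groups. No historical field records the correct label, so these rules are themselves part of the analysis.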
With undirected data ...