Chapter 10. Techniques for data mining in an operational warehouse 357
this particular document. There are several IBM Redbooks that address data
mining in an InfoSphere Warehouse environment in detail, including most
recently InfoSphere Warehouse: A Robust Infrastructure for Business
Intelligence, SG24-7813.
For completeness, we provide a summary of the concepts and processes here.
Types of data mining
There are several data mining techniques, and they can be broadly classified into
one of two categories. It is possible for a single technique to fall into both categories
depending on the role the technique is playing at a given time.
Discovery techniques
Predictive techniques
We discuss each of these in turn.
Discovery data mining and techniques
Discovery methods are designed to find patterns in the historical data without
any prior knowledge of what those patterns might be. Thus, we must discover the
patterns organically. InfoSphere Warehouse directly supports three discovery
mining methods:
Clustering
The clustering algorithm groups data records into segments according to how
similar they are on a set of attributes “of interest.” For example, we can
profile our clients by grouping them according to similar purchasing behavior
or demographic attributes, and then direct more narrowly targeted marketing
at specific customer segments. A clustering method can discover non-obvious
client groupings based on analysis of these demographic and behavior
attributes.
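To make the idea of grouping records by similarity concrete, the following is a minimal sketch of k-means, one common clustering technique, written in plain Python. It is not the InfoSphere Warehouse implementation or API. The function name kmeans, the customer records, and the two attributes (purchases per month, average basket value) are all hypothetical and chosen only for illustration.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical customers: (purchases per month, average basket value).
customers = [(2, 15.0), (3, 18.0), (2, 20.0), (12, 95.0), (14, 110.0), (11, 90.0)]

# Seed with one low-activity and one high-activity customer (an assumption).
labels, centers = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)
```

In this sketch the attributes are used unscaled, so the attribute with the larger numeric range dominates the distance. In practice the attributes are usually normalized first, and the number of segments is itself a modeling choice.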
Associations
The association method identifies links (or associations) among the data
records of individual transactions, such as a single retail purchase of
multiple items in a grocery store. A form of link analysis, the associations
method is commonly used for market basket analysis, which finds which retail
items tend to be purchased together. This knowledge enables retailers to
tailor their sales and promotions according to understood buyer patterns.
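Market basket analysis is usually described in terms of support (the share of transactions that contain a pair of items) and confidence (how often the second item appears when the first one does). The sketch below computes both for item pairs in plain Python. It is an illustration only, not the InfoSphere Warehouse associations algorithm. The function name pair_rules, the min_support threshold, and the baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) tuples for
    every item pair that appears in at least min_support of the baskets."""
    n = len(baskets)
    # Number of baskets that contain each single item.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    # Number of baskets that contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Confidence of "a => b" is P(b | a); likewise for "b => a".
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical grocery transactions, one set of items per purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} => {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

With these baskets, only the bread and butter pair reaches the 50% support threshold. Every basket that contains butter also contains bread, whereas only two of the three bread baskets contain butter, so the rule is stronger in one direction than the other.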
Sequences
Another form of link analysis, this method finds sequential patterns across
multiple transactions, as in a sequence of customer events or purchases.
Knowledge of sequential client patterns or behavior can allow retailers, for
example, to tailor the shopping experience for individual customers, such as