CHAPTER TWO

An Overview of Data Mining Techniques

SUPERVISED MODELING

Supervised modeling, whether it predicts an event or a continuous numeric outcome, requires a training dataset of historical data. Models learn from past cases: to associate input data patterns with specific outcomes, a predictive model must be presented with cases whose outcomes are known. This is the training phase, during which the predictive algorithm builds the function that connects the inputs with the target field. Once the relationships have been identified and the model has been evaluated and shown to have satisfactory predictive power, the scoring phase follows. New records, for which the outcome values are unknown, are presented to the model and scored accordingly.
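The training and scoring phases can be illustrated with a minimal sketch. It assumes a logistic regression fitted by simple gradient descent, which is only one of many possible predictive algorithms. The data, the field meanings (complaints, years as a customer, churn flag), and the function names `train` and `score` are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, targets, epochs=2000, lr=0.1):
    """Training phase: learn weights that connect the inputs to the known target."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, targets):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y  # gradient of the log loss with respect to the linear term
            bias -= lr * error
            weights = [w - lr * error * v for w, v in zip(weights, x)]
    return weights, bias


def score(model, rows):
    """Scoring phase: estimate the probability of the event for each record."""
    weights, bias = model
    return [sigmoid(bias + sum(w * v for w, v in zip(weights, x))) for x in rows]


# Hypothetical historical cases with known outcomes.
# Inputs: [complaints last quarter, years as a customer]; target: 1 = churned.
history = [[3, 1], [4, 2], [5, 1], [0, 6], [1, 8], [0, 5]]
churned = [1, 1, 1, 0, 0, 0]

model = train(history, churned)

# Evaluation, here on the training cases only to keep the sketch short.
# In practice a separate holdout sample would be used.
predicted = [s >= 0.5 for s in score(model, history)]
accuracy = sum(p == bool(y) for p, y in zip(predicted, churned)) / len(churned)

# New records with unknown outcomes are scored with the trained model.
new_scores = score(model, [[4, 1], [0, 7]])
print(accuracy, new_scores)
```

The key design point is the separation of the two steps: `train` sees the target values, while `score` needs only the inputs and the fitted model.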

Some predictive models, such as regression and decision trees, are transparent: they provide an explanation of their results. Besides prediction, these models can also be used for insight and profiling. They can identify the inputs with a significant effect on the target attribute and reveal the type and magnitude of that effect. For instance, supervised models can be applied to find the drivers of customer satisfaction or attrition. Similarly, they can supplement traditional reporting techniques in profiling the segments of an organization by identifying the differentiating features of each group.
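As a rough illustration of this transparency, the sketch below builds a one-level decision tree (a single split chosen by Gini impurity) and reports it as a readable rule. This is a simplified stand-in for a full decision tree algorithm, not a specific product's implementation. The input names, the data, and the helper `best_split` are hypothetical.

```python
def gini(labels):
    # Gini impurity of a group of binary (0/1) labels.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, targets, names):
    """Find the single input and threshold that best separate the target classes."""
    best = None
    n = len(rows)
    for j, name in enumerate(names):
        values = sorted({r[j] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, targets) if r[j] <= threshold]
            right = [y for r, y in zip(rows, targets) if r[j] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best["impurity"]:
                best = {
                    "input": name,
                    "threshold": threshold,
                    "impurity": impurity,
                    "rate_below": sum(left) / len(left),
                    "rate_above": sum(right) / len(right),
                }
    return best


# Hypothetical customer records: [complaints, monthly spend]; target: 1 = attrition.
names = ["complaints", "monthly_spend"]
rows = [[0, 40], [1, 75], [0, 60], [1, 30], [3, 55], [4, 35], [5, 70], [3, 45]]
targets = [0, 0, 0, 0, 1, 1, 1, 1]

rule = best_split(rows, targets, names)
print(
    f"Split on {rule['input']} at {rule['threshold']}: "
    f"attrition rate {rule['rate_below']:.0%} at or below, "
    f"{rule['rate_above']:.0%} above"
)
```

The resulting rule names the input that differentiates the two groups and shows the direction and size of its effect through the attrition rates on each side of the threshold, which is the kind of insight and profiling described above.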

According to the measurement level ...
