Data Mining Techniques for Segmentation


In this chapter we focus on the data mining modeling techniques used for segmentation. We will present in detail some of the most popular and efficient clustering algorithms, their settings, strengths, and capabilities, and we will see them in action through a simple example that aims at preparing readers for the real-world applications to be presented in subsequent chapters.

Although clustering algorithms can be applied directly to the input data, it is highly recommended, though optional, to first apply a data reduction technique that removes redundant information and so simplifies and enhances the segmentation process. Data reduction adjusts for intercorrelations among the input fields, so that the resulting segmentation solution accounts equally for all the underlying data dimensions rather than being biased toward dimensions that happen to be measured by many overlapping fields. This chapter therefore also presents in detail principal components analysis (PCA), an established data reduction technique typically used for grouping the original fields into meaningful components.
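The effect of intercorrelated inputs can be illustrated with a small numeric sketch. It assumes a distance-based clustering algorithm that uses Euclidean distance, and it assumes the reduced measures are standardized to the same scale; the field layout (three near-duplicate usage fields and one spending field) and all values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical standardized records with four fields:
# three near-duplicate usage measures followed by one spending measure.
base = (0.0, 0.0, 0.0, 0.0)
differs_in_usage = (1.0, 1.0, 1.0, 0.0)  # one unit apart on the usage dimension
differs_in_spend = (0.0, 0.0, 0.0, 1.0)  # one unit apart on the spending dimension

# On the raw fields, the usage difference is counted three times.
d_usage = euclidean(base, differs_in_usage)
d_spend = euclidean(base, differs_in_spend)
print(d_usage, d_spend)

# After reduction to one measure per underlying dimension (assumed standardized),
# both differences carry the same weight.
d_usage_reduced = euclidean((0.0, 0.0), (1.0, 0.0))
d_spend_reduced = euclidean((0.0, 0.0), (0.0, 1.0))
print(d_usage_reduced, d_spend_reduced)
```

On the raw fields the usage difference looks larger than the spending difference only because usage is recorded three times; on the reduced measures the two differences are equal.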


PCA is a statistical technique for data reduction. From the original input fields it derives a limited number of compound measures (components) that can efficiently substitute for the original inputs while retaining most of their information.
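As a minimal sketch of this idea, the following NumPy code derives components from the correlation matrix of the inputs and keeps the fewest components needed to retain a chosen share of the total variance. The function name `pca_components`, the 90% threshold, and the synthetic four-field dataset are illustrative assumptions, not taken from the book.

```python
import numpy as np

def pca_components(X, variance_to_keep=0.9):
    """Return component scores, loadings, and explained-variance ratios."""
    # Standardize each field so that the analysis is based on correlations.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)

    # Eigen-decomposition of the symmetric correlation matrix,
    # reordered from largest to smallest eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Share of total variance carried by each component.
    ratios = eigvals / eigvals.sum()

    # Keep the fewest components whose cumulative share reaches the threshold.
    k = int(np.searchsorted(np.cumsum(ratios), variance_to_keep)) + 1
    scores = Z @ eigvecs[:, :k]
    return scores, eigvecs[:, :k], ratios

# Hypothetical data: four fields driven by two underlying dimensions.
rng = np.random.default_rng(0)
n = 500
usage = rng.normal(size=n)
spend = rng.normal(size=n)
X = np.column_stack([
    usage + 0.1 * rng.normal(size=n),
    usage + 0.1 * rng.normal(size=n),
    spend + 0.1 * rng.normal(size=n),
    spend + 0.1 * rng.normal(size=n),
])

scores, loadings, ratios = pca_components(X)
print(scores.shape)     # two components replace the four original fields
print(ratios.round(3))  # share of variance per component, largest first
```

Because each pair of fields is almost perfectly correlated, two components carry nearly all of the variance of the four inputs, so the component scores can stand in for the original fields in later steps.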

PCA is based on linear correlations. The concept of linear correlation and the measure of ...
