Overview of Partition
Variations of partitioning go by many names and brand names: decision trees, CART™, CHAID™, C4.5, C5, and others. The technique is often taught as a data mining technique because:
• it is useful for exploring relationships without having a good prior model,
• it handles large problems easily, and
• the results are very interpretable.
A classic application is turning a data table of symptoms and diagnoses for a certain illness into a hierarchy of questions. These questions help diagnose new patients more quickly.
The factor columns (X’s) can be either continuous or categorical (nominal or ordinal). If an X is continuous, then the splits (partitions) are created by a cutting value. The sample is divided into ...
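To make the idea of a cutting value concrete, here is a minimal Python sketch of how one might search for a single split on a continuous X. It is an illustration under stated assumptions, not the platform's actual algorithm: the text above does not say which splitting criterion is used, so the sketch assumes a continuous response and picks the cut that minimizes the total within-group sum of squared errors. The names best_cut and sse are hypothetical, and the convention that rows with x below the cut go to one group and the rest to the other is also an assumption.

```python
def sse(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_cut(x, y):
    """Find one cutting value for a continuous X (hypothetical helper).

    Assumed criterion: minimize SSE(left) + SSE(right), where the left
    group has x < cut and the right group has x >= cut. Candidate cuts
    are midpoints between adjacent distinct sorted x values.
    Returns (cut, total_sse), or (None, inf) if no split is possible.
    """
    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut can separate two equal x values
        left = [resp for _, resp in pairs[:i]]
        right = [resp for _, resp in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, total)
    return best


# Made-up data: the response jumps between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [5, 6, 5, 20, 21, 19]
cut, total = best_cut(x, y)
print(cut)  # 6.5
```

Applying the same search again within each of the two resulting groups is what builds up a tree of splits. A categorical X would need a different search, over groupings of levels rather than cutting values.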