Variations of partitioning go by many names and brand names: decision trees, CART™, CHAID™, C4.5, C5, and others. Partitioning is often taught as a data mining technique for the following reasons:
• it is useful for exploring relationships without having a good prior model
• it handles large problems easily
• the results are very interpretable
A classic application is turning a data table of symptoms and diagnoses for a certain illness into a hierarchy of questions that helps diagnose new patients more quickly.
The factor columns (Xs) can be either continuous or categorical (nominal or ordinal). If an X is continuous, then the splits (partitions) are created by a cutting value. The sample is ...
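To make the idea of a cutting value concrete, the following is a minimal sketch in Python. It assumes a single continuous X and a numeric response, and it picks the cut that minimizes the within-group sum of squared errors. That criterion, the function name best_cut, and the toy data are all assumptions made for illustration; commercial and published partitioning methods use their own splitting criteria.

```python
def best_cut(xs, ys):
    """Return (cut, sse) for the cutting value on xs that minimizes the
    combined within-group sum of squared errors of ys, where the two
    groups are rows with x < cut and rows with x >= cut.
    Returns None if xs has fewer than two distinct values."""

    def sse(vals):
        # Sum of squared deviations from the group mean (0.0 for an empty group).
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    levels = sorted(set(xs))
    best = None
    # Candidate cuts are midpoints between adjacent distinct X values.
    for low, high in zip(levels, levels[1:]):
        cut = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (cut, total)
    return best


# Hypothetical toy data: age as the continuous X, a 0/1 diagnosis as the response.
ages = [22, 25, 31, 38, 45, 52, 60]
diagnosis = [0, 0, 0, 1, 1, 1, 1]

cut, err = best_cut(ages, diagnosis)
print(cut, err)
```

Applying the same search again within each resulting group, over every candidate X, is what builds up a tree of questions such as "Is age below the cutting value?".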