Using CHAID stumps when interviewing an SME
In this recipe we will learn how to use the interactive mode of the CHAID Modeling node to explore data. The name stump comes from the idea that we grow just one branch and stop. The exploration will have the goal of answering five questions:
- What variables seem predictive of the target?
- Do the most predictive variables make sense?
- What questions are most useful to pose to the Subject Matter Experts (SMEs) about data quality?
- What is the potential value of the favorite variables of the SMEs?
- What missing data challenges are present in the data?
We will start with a blank stream.
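Outside of Modeler, the idea behind a stump can be sketched in a few lines of Python: score each candidate predictor against the target with a Pearson chi-square statistic and rank the results, which roughly corresponds to asking which variable would win the first split. This is only an illustration under stated assumptions. The field names (owner, region, responded) and the toy rows are hypothetical and are not taken from the recipe's data file, and the sketch skips the category merging and adjusted p-values that the CHAID node applies.

```python
from collections import Counter

def chi_square(rows, predictor, target):
    """Pearson chi-square statistic for predictor vs. target.

    Missing values (None) are simply treated as their own category.
    """
    n = len(rows)
    cell = Counter((r[predictor], r[target]) for r in rows)
    row_tot = Counter(r[predictor] for r in rows)
    col_tot = Counter(r[target] for r in rows)
    stat = 0.0
    for p in row_tot:
        for t in col_tot:
            expected = row_tot[p] * col_tot[t] / n
            stat += (cell[(p, t)] - expected) ** 2 / expected
    return stat

def rank_stumps(rows, predictors, target):
    """Return (predictor, statistic) pairs, strongest association first."""
    scores = {p: chi_square(rows, p, target) for p in predictors}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: "owner" separates the target perfectly,
# while "region" is unrelated to it.
rows = [
    {"owner": "Y", "region": "N", "responded": 1},
    {"owner": "Y", "region": "S", "responded": 1},
    {"owner": "Y", "region": "N", "responded": 1},
    {"owner": "Y", "region": "S", "responded": 1},
    {"owner": "N", "region": "N", "responded": 0},
    {"owner": "N", "region": "S", "responded": 0},
    {"owner": "N", "region": "N", "responded": 0},
    {"owner": "N", "region": "S", "responded": 0},
]

ranking = rank_stumps(rows, ["owner", "region"], "responded")
for name, stat in ranking:
    print(f"{name}: chi-square = {stat:.2f}")
```

A ranking like this is a starting point for the SME conversation rather than a conclusion: a variable that scores suspiciously well is often worth a data quality question.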
How to do it...
To use CHAID stumps:
- Add a Source node to the stream for the cup98lrn reduced vars2.txt file. Ensure that ...